AD-775  545 


A  NEW  GRAMMATICAL  TRANSFORMATION  INTO 
DETERMINISTIC  TOP-DOWN  FORM 


Michael  M.  Hammer 


Massachusetts  Institute  of  Technology 


Prepared  for: 

Office  of  Naval  Research 


February  1974 


DISTRIBUTED  BY: 




National Technical Information Service
U.  S.  DEPARTMENT  OF  COMMERCE 

5285  Port  Royal  Road,  Springfield  Va.  22151 




BIBLIOGRAPHIC  DATA 
SHEET 


1. Report Nos. NSF-OCA-GJ34671 +
N00014-70-A-0362-0001, TR-119


A  New  Grammatical  Transformation  Into  Deterministic 
Top-Down  Form 


7.  Author(s) 

Michael  M.  Hammer 


9.  Performing  Organization  Name  and  Address 

PROJECT MAC; MASSACHUSETTS INSTITUTE OF TECHNOLOGY
545  Technology  Square,  Cambridge,  Massachusetts  02139 


5. Report Date

February 1974


8. Performing Organization Rept. No.

MAC TR-119


10. Project/Task/Work Unit No.


11. Contract/Grant Nos.

GJ34671 and N00014-70-A-0362-0001


13. Type of Report & Period Covered: Interim
Scientific Report


12. Sponsoring Organization Name and Address

Office of Naval Research, Department of the Navy,
Information Systems Program, Arlington, Va 22217

Associate Program Director, Office of Computing Activities,
National Science Foundation, Washington, D. C. 20550


15.  Supplementary  Notes 

Ph.D. Thesis, M.I.T., Department of Electrical Engineering, August 1973


16.  Abstracts 

Although deterministic top-down parsing is an attractive parsing technique,
the grammars to which it is applicable (the LL(k) grammars) are but a small subset
of the LR(k) grammars, those that can be parsed deterministically bottom-up. In
this thesis, the problem of transforming LR(k) grammars into equivalent LL(k)
grammars is studied.

A new transformation procedure is devised which is more powerful than currently
available techniques and which preserves the compiling ability of the grammar.


17. Key Words and Document Analysis. 17a. Descriptors

Compilers
Deterministic grammars
Parsing
Syntax Analysis


17b. Identifiers/Open-Ended Terms


17c. COSATI Field/Group


18.  Availability  Statement 

Unlimited  Distribution 




Security Class (This Page): UNCLASSIFIED




MAC  TR-119 


A  NEW  GRAMMATICAL  TRANSFORMATION  INTO 
DETERMINISTIC  TOP-DOWN  FORM 


Michael  M.  Hammer 


February  1974 




This  research  was  supported  in  part  by 
the  National  Science  Foundation  under 
research  grant  GJ-34671,  and  in  part  by 
the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  ARPA 
Order  No.  433  which  was  monitored  by  ONR 
Contract No. N00014-70-A-0362-0001.


MASSACHUSETTS INSTITUTE OF TECHNOLOGY
PROJECT MAC

CAMBRIDGE MASSACHUSETTS 02139

DISTRIBUTION STATEMENT A
Approved for public release;
Distribution Unlimited




A NEW GRAMMATICAL TRANSFORMATION INTO DETERMINISTIC TOP-DOWN FORM

BY

MICHAEL MARTIN HAMMER


Submitted  to  the  Department  of  Electrical  Engineering  on  August  24,  1973  in 
partial  fulfillment  of  the  requirements  for  the  Degree  of  Doctor  of  Philosophy. 


ABSTRACT 

Although deterministic top-down parsing is an attractive parsing technique,
the grammars to which it is applicable (the LL(k) grammars) are but a
small subset of the LR(k) grammars, those that can be parsed deterministically
bottom-up. In this thesis, the problem of transforming LR(k) grammars into
equivalent LL(k) grammars is studied.

A new method of parsing, called multiple stack parsing, is introduced. This
is a generalization of LR(k) parsing, containing a minimal infusion of top-down
predictive techniques into deterministic bottom-up parsing. An automaton, the
MSP(k) machine, which parses strings in this way is formally defined, and is
shown to be equivalent to the canonical LR(k) parsing machine. A transformation
procedure is described which constructs from M, a particular kind of MSP(k)
machine for the grammar G (called a cycle-free machine), a new derived grammar
$T_M(G)$. The grammar $T_M(G)$ generates the same language as the grammar G does and
is (strong) LL(k) as well. No translating ability is lost in effecting this
transformation, in the sense that $T_M(G)$ can support the same class of
compilation activities, or syntax-directed translations, that G can.

The class of grammars which can be parsed by cycle-free MSP(k) machines, and
so are amenable to transformation, strictly includes both the LL(k) grammars
and the LC(k) (left corner parsable) grammars. Thus our transformation is more
powerful than the previously available one of Rosenkrantz and Lewis. Furthermore,
there are good algorithms applicable to many transformable grammars for
constructing a cycle-free MSP(k) machine and making the entire transformation process
more efficient; and the size of $T_M(G)$ can be systematically reduced without
sacrificing its desirable qualities.
CHAPTER ONE
INTRODUCTION

1.1 Background and Statement of the Problem

For some time, it has been recognized that context-free grammars form
a good meta-language for the syntactic description and specification of
programming languages. And over the past decade, there has been a developing
trend toward eschewal of "hand-coding" the syntactic analysis phase of the
compilers for such languages. Rather, the tendency has been toward the
so-called syntax-directed compiling schemes, wherein a stylized representation
of the grammatical specification of the language is directly used as data by
some general purpose parsing scheme to process programs in the language.
Whatever speed is lost by having the parsing done by a general processor is
offset by the ease of implementation of such a compiler and the fact that its
clean design makes it easily comprehensible and amenable to change should the
occasion arise. Furthermore, the speed problem has become less pronounced as
a variety of fast parsing schemes have been developed, each applicable to some
particular class of grammars.

This development has been further encouraged by burgeoning interest in
compiler-compilers, or automatic translator writing systems. To build a
compiler with such a system, a language designer submits to the compiler-compiler
a syntactic specification of his language in BNF and a semantic specification
of the language, which commonly associates with each rule of the grammar an
"action routine," specifying how an instance of that rule is to be processed
by a compiler. The system then tailors its general-purpose table-driven
parser to the supplied grammar, hooks in the semantic routines, and out comes a
compiler for the language.

The catch in all this is that the grammar devised by a language designer,
though reflecting his perception of the language and its features, and also
possibly of great descriptive value, might not be amenable to parsing by
whatever particular tools and methods a compiler writer has available. A
grammatical description which is good for people is not necessarily good for
computers, for use in an implemented compiler for the language. On the contrary,
the more constrained and idiosyncratic a class of grammars is, the more
compact and efficient a compiler designed on its defining principles can be; yet
the more specialized such a class is, the less likely it is that an arbitrary
human-designed grammar will satisfy the characteristic conditions of that class.
For example, the general parsing method of Knuth for the full class of LR(k)
(or deterministically parsable in a left-to-right manner) grammars, yields a
parser much too large to be of real practical value. A great many subclasses
of this general class have been devised, each with its own specific parsing
algorithm; most of these are summarised in [1]. However, it is all too easy
for  a  grammar,  though  meeting  the  fairly  lax  and  reasonable  requirements  of 
the  class  of  LR(k)  grammars,  to  fail  to  meet  all  the  restrictions  of  one  of 
these  smaller  classes.  And  very  often,  the  failure  is  not  for  any  systemic 
reason;  that  is,  there  often  is  another  grammar  for  the  same  language  which 
does  satisfy  the  appropriate  constraint,  to  enable  parsing  in  the  desired 
manner.  It  is  just  that  the  particular  grammar,  as  designed  by  the  language 
designer,  fails  to  pass  all  the  tests. 

This situation is particularly acute in the case of deterministic
top-down parsing. This is a mode of parsing that has long been known, has proven
to have a variety of attractive features, and yet has fallen into disuse for
quite some time. This is primarily due to the fact that most grammars do
not meet the strict requirements under which this method can be applied. For
example, if there is even one left-recursive nonterminal in the grammar (as
is frequently the case), then the grammar fails to be deterministically
top-down parsable.

This  is  a  truly  unfortunate  state  of  affairs,  for  deterministic  top- 
down  parsing  is  indeed  a  most  attractive  technique.  The  grammars  amenable 
to  such  processing  have  been  termed  the  LL(k)  grammars  by  Lewis  and  Stearns 
[13], and have been studied by many authors [12, 18, 23, 11, 20]. In
particular, the latter two are good surveys of the field. In Stearns'
words, the two chief advantages of LL(k) parsing (in particular, LL(1)
parsing), are:

1)  parsing  can  be  carried  out  by  a  one  state  deterministic 
pushdown  machine  using  very  simple  logic; 

2)  the  parsing  algorithm  accommodates  very  flexible  language 
translation. 

LL(k) parsing proceeds in a very direct and efficient manner, which mimics the
leftmost derivation of the program in the grammar. When a nonterminal is on
top of the parsing stack, examination of k symbols of lookahead resolves the
decision as to which grammar rule is to be applied to that nonterminal; the
nonterminal is then replaced on the stack by the right-hand side of the rule,
and the process continues. (When a terminal symbol is on top of the stack,
it is matched against the next input symbol and both are eliminated.) A
parser constructed on these principles can be made to run very rapidly; and its
design is so simple and clean that it should be easy to implement. In addition,
an LL(k) parser is in a much better position to give meaningful diagnostics
and to attempt error recovery when presented with an illegal program, than
is an LR(k) parser; because while the LR(k) parser merely becomes stymied
when it encounters the error, the LL(k) parser knows the current goal (i.e.,
what it is trying to find), and can thus report somewhat more precisely as
to the nature of the error.
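
The stack discipline just described can be illustrated for the case k = 1 by the following minimal sketch. It is only an illustration under stated assumptions: the toy grammar S -> a S b | c, the table TABLE, the end marker "$", and the function name ll1_parse are all hypothetical and do not come from the thesis.

```python
# Toy LL(1) parsing table (hypothetical) for the grammar  S -> a S b | c.
# A key is (nonterminal on top of the stack, lookahead symbol);
# the value is the right-hand side of the rule to apply.
TABLE = {
    ("S", "a"): ("a", "S", "b"),
    ("S", "c"): ("c",),
}


def ll1_parse(table, start, nonterminals, tokens, end="$"):
    """Parse `tokens` top-down; return the rules applied, in leftmost order.

    Raises SyntaxError, naming the current goal, on an illegal input.
    """
    stack = [end, start]              # the top of the stack is the list's end
    tokens = list(tokens) + [end]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in nonterminals:
            rhs = table.get((top, look))
            if rhs is None:
                raise SyntaxError(
                    f"expecting {top}, found {look!r} at position {pos}")
            applied.append((top, rhs))
            # Replace the nonterminal by the right-hand side of the rule,
            # pushing it reversed so that its first symbol ends up on top.
            stack.extend(reversed(rhs))
        else:
            if top != look:
                raise SyntaxError(
                    f"expecting {top!r}, found {look!r} at position {pos}")
            pos += 1                  # terminal matched: drop both symbols
    return applied
```

For the input "aacbb" the sketch applies S -> a S b twice and then S -> c, which is the leftmost derivation of that string. Because the parser always knows which symbol it is trying to find, the error message can name that goal.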

Furthermore, since the identity of the rule to be applied is determined
by an LL(k) parser before all symbols generated by that rule have been read
(this is in contrast to bottom-up parsing schemes), it is possible to
determine what semantic routine is to be applied before its arguments have already
been read and processed. That is, actions can be executed before the end of
a rule has been reached, and so it is exceptionally easy to coordinate
semantic processing of a general kind with a top-down parser.

Lewis  and  Rosenkrantz  [14]  have  recently  reported  on  a  compiler  for 
ALGOL  60,  which  uses  an  LL(1)  parser,  and  which  was  designed  in  a  structured 
way  based  on  the  principles  elucidated  above.  This  compiler  achieves  both 
speed in compiling (not just in parsing) and a clean and understandable design.
Similar  principles  have  been  applied  in  the  rapid  construction  of  an  efficient 
and  well-designed  FORTRAN  compiler.  A  compiler-compiler,  which  constructed 
LL(k)  parsers  for  the  grammars  supplied  it,  could  reasonably  be  expected  to 
produce  good  compilers.  Thus  it  is  regrettable  that  most  grammars  fail  to  be 
LL(k)  and  are  consequently  unsuitable  for  deterministic  top-down  parsing. 

In order to apply LL(k) parsing techniques more often, we would like to be able
to apply some transformations to non-LL(k) grammars and convert them into LL(k) form.
Even if the transformed grammar were to have a very different structure from,
and bear no apparent resemblance to, the original grammar, it would not matter;
for the original grammar could still continue to serve its definitional and
pedagogic  functions,  while  the  scope  of  the  new  grammar  would  be  confined  to 
the  compiler  implementation.  The  only  point  for  caution  would  be  that  the 
transformation  should  preserve  translatability  (in  the  sense  of  syntax- 
directed  translations).  That  is,  any  actions  performed  by  semantic  routines 
operating  in  conjunction  with  a  bottom-up  parser  for  the  original  grammar, 
should  be  implementable  by  semantic  routines  working  with  the  LL(k)  parser 
for  the  transformed  grammar.  If  this  condition  is  met,  it  is  possible  to 
envision a new stage of a compiler-compiler, being a transformation phase.
This  step  would  convert  a  non-LL(k)  grammar  into  LL(k)  form,  simultaneously 
transforming  the  associated  semantic  specification;  it  would  pass  the  new 
grammar  to  the  main  body  of  the  translator  writing  system,  which  would  be 
designed  to  construct  LL(k)  compilers  for  the  grammars  provided  to  it. 

In fact there has been a certain amount of research along these lines
[6, 19, 20], but most of it has concerned variants of the same basic
principle, a common transformation to rid the grammar of left-recursive
nonterminals. This transformation has been described by a number of authors
[17, 22], and we shall call it the Rosenkrantz transformation. The
transformation is rather complex, but basically it changes rules $A \to Ab$ and $A \to a$
into rules $A \to a(A, A)$; $(A, A) \to b$; and $(A, A) \to \epsilon$. Here (A, A) is a new
nonterminal that stands for "the rest of an A after an A;" the idea is that
if A generated Aa in the old grammar, (A, A) will generate a in the new one.
Foster's Syntax Improving Device [6] (as well as Wood's improved version of
it [24]) basically consists of the application of this transformation,
possibly followed by some left factoring. (Left factoring is a process of
transforming rules $A \to BC$ and $A \to BD$ into $A \to BX$, $X \to C$, and $X \to D$.) These two
transforming methods (possibly used in conjunction with substitution) are
those cited by Stearns [20] as being the tricks most widely used in attempting
to create an equivalent LL(k) grammar, given a grammar which is not LL(k). At
the present time however, our understanding of these transformations remains
at a rudimentary level; and the method of their application usually
consists of heuristics and guesswork. There is little even in the way of good
strategies for the use of these transformations. In short, the entire issue
of transforming grammars into LL(k) form is not well understood, and the
current state of the art is largely trial-and-error.
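
The two tricks mentioned above can be sketched in a few lines. This is a hedged illustration only: the dictionary representation of a grammar and the names left_factor and remove_immediate_left_recursion are assumptions introduced here. The second function uses the standard textbook rule for immediate left recursion, which is in the same spirit as the Rosenkrantz transformation but is not claimed to be identical to it.

```python
def left_factor(grammar, lhs, fresh):
    """Factor a shared first symbol out of the alternatives of `lhs`.

    grammar: dict mapping a nonterminal to a list of right-hand sides,
             each right-hand side being a tuple of symbols.
    fresh:   iterator yielding unused nonterminal names (an assumption).
    """
    groups = {}
    for rhs in grammar[lhs]:
        groups.setdefault(rhs[:1], []).append(rhs)   # group by first symbol
    result = dict(grammar)
    result[lhs] = []
    for head, alts in groups.items():
        if len(alts) == 1 or head == ():
            result[lhs].extend(alts)                 # nothing to factor
        else:
            new_nt = next(fresh)
            result[lhs].append(head + (new_nt,))     # A -> B X
            result[new_nt] = [rhs[1:] for rhs in alts]   # X -> C, X -> D
    return result


def remove_immediate_left_recursion(grammar, lhs, tail):
    """Textbook rule: A -> A beta | alpha  becomes
    A -> alpha T,  T -> beta T | (empty), where T is the new name `tail`."""
    recursive = [rhs[1:] for rhs in grammar[lhs] if rhs[:1] == (lhs,)]
    others = [rhs for rhs in grammar[lhs] if rhs[:1] != (lhs,)]
    if not recursive:
        return dict(grammar)
    result = dict(grammar)
    result[lhs] = [rhs + (tail,) for rhs in others]
    result[tail] = [rest + (tail,) for rest in recursive] + [()]
    return result
```

Applied to the rules A -> BC and A -> BD with the fresh name X, left_factor yields A -> BX, X -> C, and X -> D, as in the parenthetical remark above. Neither function decides when or in what order to apply these steps, and that choice is exactly where the heuristics and guesswork enter.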

The purpose of this thesis is to contribute to the state of knowledge
and understanding of the process of transforming grammars into LL(k) form.
We shall introduce a new transformation which will be applicable to a wide
class of grammars; and we shall be able to determine if an arbitrary
grammar is in this class. Furthermore, our transformation will proceed in a
deterministic manner, and we shall investigate the nature of its process,
both on formal and intuitive levels. And so as to be of practical value,
the transformation will preserve translatability, in the sense outlined above.
In short, we hope to make a contribution towards rationalising the process
of finding equivalent LL(k) forms for non-LL(k) grammars.

We see the possible value of our work coming from two sources. On the one
hand, it should serve to further the sum of knowledge in this area and perhaps lead
to a full and complete solution to the problem. At the same time, we can envision the
implementation of an actual grammar transformer, embodying the principles of
the transformation we shall devise. We shall keep both of these goals in
mind throughout. The bulk of our work is of a formal and precise nature,
hoping to avoid the vague and often unsatisfying approach frequently taken to
work of this kind. Yet we shall also strive to provide intuitive explanations
of our development, and we shall not neglect the issues of time and space
which assume primary importance in the context of an implementation effort.
We shall also include some less formal and semi-heuristic suggestions as to
how our transformation may be accomplished most efficiently in many cases.

There has been some research in the spirit just adumbrated. In [19],
Rosenkrantz and Lewis have characterized the class of grammars that can be
converted into LL(k) form by a single application of the Rosenkrantz
transformation. They denote these grammars as the LC(k) grammars; these
grammars are distinguished by the fact that they can be parsed in a peculiar
hybrid of top-down and bottom-up parsing, known as the left-corner bottom-up
parsing method. This technique has been known for some time and was used by a
variety of authors in early compiling systems [3, 7, 9]. For
a long time this technique was thought of as being pure bottom-up [11], but
deeper understanding of it has shown it to be a mixture of the two major classes
of parsing schemes. Basically, a left corner parse consists of doing a
bottom-up parse in order to find the left corner of a rule (the first symbol on its
right hand side), and then switching over to a predictive scheme to determine
the rest of the rule. In their paper, Rosenkrantz and Lewis formally define
the LC(k) grammars as those which are amenable to being parsed in such a
fashion. They define a formal automaton (the canonical left corner
pushdown machine) as a model for the actions of such a parser, and they argue
that such a machine, designed to parse a specific LC(k) grammar, does indeed
succeed in recognizing precisely the language of that grammar. Then they
use this machine model and its relationship to a machine model for LL(k)
parsing to show that a grammar is LC(k) if and only if applying the
Rosenkrantz transformation to it creates an LL(k) grammar. Finally they show
that a large class of translations on the original grammar are also
expressible as translations on the transformed grammar.

To our knowledge, this is the only firm and theoretical work on the
problem of transforming non-LL(k) grammars into equivalent LL(k) forms. (However,
in the only published version of this work, the proofs are omitted, presumably
because of their length and difficulty.) We have taken it as our mandate to
further the work of Rosenkrantz and Lewis, and to provide additional insight
into the transformation process in a precise and rigorous way. We shall use
their earlier work as a jumping-off point for our research in several ways.
First of all, we shall try to proceed in the same rigorous fashion. Secondly,
we shall take it as our mission to find a new method of transformation which
will supersede the Rosenkrantz transformation and which will be applicable
to a class of grammars which strictly includes the LC(k) grammars. And
finally, we shall make use of a similar methodology; namely, we shall devise
an even more mixed and general hybrid scheme of parsing, develop an automaton
model for it, and show how this machine may be used to derive a new LL(k) form
for the original grammar for which the automaton was constructed.

We shall concentrate on the problem of transforming LR(k) grammars into
LL(k) form. (Although not generally stated, most of the other work in the
field has also been subject to this same restricted viewpoint.) That is, we
shall devise an algorithm for taking a grammar supplied by a language
designer, that could as it stands be parsed in a deterministic bottom-up manner,
and changing it into an equivalent LL(k) grammar. There are several reasons
for limiting our interest to LR(k) grammars as opposed to arbitrary
context-free grammars. First, the argument against top-down parsing has always been
that many grammars which could be parsed in other deterministic ways could
not be handled by an LL(k) parser. But if a grammar is not LR(k), then there
is no way to parse it deterministically on a pushdown machine, so the added
charge that it is not LL(k) is not very serious. It is just for those
grammars that can be parsed in some other way that we want to find top-down forms,
and thereby scotch the criticism of detractors of LL(k) parsing. Furthermore,
it is too difficult to get a handle on an arbitrary context-free grammar; their
general nondeterministic parsing methods do not allow for simple hybridization. And
finally, it is not asking too much for a language designer to couch his
description in a deterministic way. Arguments have been made that an LR(k)
grammar is a natural mode of expression, and that if a grammar fails to meet
its criteria, there is some structural defect in the language, some ambiguity
of syntax or semantics. Thus we shall restrict our attention to LR(k)
grammars throughout the thesis.

1.2  Outline  of  the  Thesis 

Chapter 2 consists primarily of a review of the standard terminology
and notation we shall be using. We do introduce some new terms, and we
concentrate on the concepts of LR(k) parsing. We define a particular version
of the LR(k) parsing machine which we shall find most convenient.

Chapter 3 is devoted to the development and explication of our
particular new hybrid of top-down and bottom-up parsing. This technique can be easily
summarized. It essentially consists of a standard left-to-right, bottom-up
parse, except that the parser has the ability to make predictions as to the
identity of nonterminals that it is going to find. That is, at any point
during the parse the parser may, based on inspection of some lookahead
symbols, predict that some prefix of the remaining input is going to be reduced
to, say, an A. This is known as predicting an A. However, once the A is
predicted, parsing continues in the normal bottom-up fashion. We can envision
the prediction of an A as entailing the invocation of a separate parsing
machine, whose goal is to reduce an input string to an A (in
contrast to the main parser, whose goal is to reduce the input to
an S). When this A-parser (which operates in a pure bottom-up manner) has
completed its task by reducing some prefix of the remaining input to an A,
it returns control to the main parser, which picks up where the other one
left off. This then is our new model of a parsing machine called an MSP(k)
machine (for multiple stack parsing): a collection of bottom-up parsers
which can decide to call each other (or themselves) if inspection of the
lookahead warrants it. We note that the order of recognition by such a scheme is
exactly that of conventional bottom-up parsing. To be precise, an MSP(k)
machine for G does the same things as an LR(k) machine for G, except that it
takes more steps to do it in (namely, making the predictions); and initially,
this technique seems to be of no particular interest.

Chapter 3 is devoted to laying all the fundamental ground work necessary
for the use of these ideas. Sections 3.1 and 3.2 contain a detailed
descriptive introduction to the concepts of this mode of parsing. Section 3.3
contains the formal definition of the notion of prediction in bottom-up
parsing. Finally in Section 3.4, we introduce the formal model of an MSP(k)
parsing machine, a generalization of the LR(k) machine, as an automaton
which accommodates the making of predictions during bottom-up parsing. We
show in Section 3.5 that every MSP(k) machine for G can be obtained from the
LR(k) machine for G, by the repeated application of a relatively simple
state-splitting algorithm. In Section 3.6, we prove the expected result that an
MSP(k) machine for G does indeed recognize precisely the language generated
by G, and that it parses such strings in the conventional bottom-up manner.

In  Chapter  4  we  restrict  our  attention  to  a  particular  kind  of  MSP(k) 
machine,  called  a  cycle-free  MSP(k)  machine;  and  we  show  how  to  derive  an 
LL(k)  grammar  from  such  a  cycle-free  machine.  In  Chapter  5  we  address  the 
problem  of  constructing  such  a  cycle-free  MSP(k)  machine  from  the  LR(k) 
machine  for  a  grammar  G.  The  combination  of  these  two  procedures  is  our 
transformation  process:  the  construction  of  a  cycle-free  machine  and  then 
the derivation of a grammar from it.

A  cycle-free  MSP(k)  machine  is  one  such  that  each  component  submachine 
contains  no  cycles;  hence  each  component  machine  needs  only  a  finite  stack. 
Given a cycle-free machine M for G, our transformation, set forth in Section
4.2, derives a new grammar $T_M(G)$. The nature of this grammar is heavily
dependent on the structure of M. We give a technique for naming the states
of a cycle-free machine; these names serve as the nonterminals of the derived
grammar. The transitions among the states are schema for the rules of the
grammar.

Once we have defined the transformed grammar $T_M(G)$, we establish its
relationship to the original grammar G. First of all, we show that $T_M(G)$
generates the same language that M recognizes; we do this by establishing a
correspondence between leftmost derivations in $T_M(G)$ and sequences of
configurations of M. This then establishes that $L(T_M(G)) = L(G)$. Since M is
defined to be deterministic, this means that the derived grammar is
deterministically top-down parsable; i.e., LL(k). A summary of the ideas of this
transformation, and some insight into its nature, is provided in Section
4.6.


In the final sections of Chapter 4, we de-emphasize the formal aspects
of $T_M(G)$, and concentrate on its utilitarian value. First we establish that
$T_M(G)$ is just as useful as G is, in terms of supporting translations; i.e.,
when it comes to directing compiling activities associated with the
appropriate parser. Then we address the problem of bringing the size of $T_M(G)$
(which reflects the size and complexity of M), down to more manageable
proportions, without sacrificing either its LL-ness or its compiling ability.

In Section 5.1, we study the class of grammars which can be parsed by
cycle-free MSP(k) machines, and which therefore we can transform into
equivalent LL(k) grammars. We show that this is a decidable class, so it is
possible to determine if a given grammar is transformable. Then we prove that
this class of grammars strictly contains both the LL(k) and LC(k) grammars.
This means that we have achieved our goal of generalizing the work of
Rosenkrantz and Lewis.
However, even though the decision procedure for this class is
constructive (i.e., it produces a cycle-free machine for G if one exists), it is
terribly unsophisticated and inefficient. In Section 5.2 we discuss a number of
semi-heuristic strategies for the efficient construction of cycle-free machines;
these techniques do not work for all transformable grammars, but they are
applicable to a very wide class of grammars which includes all grammars of
reasonable character. These methods would be of great use to a person or a
compiler-compiler attempting to apply our transformation. Although the bulk
of our work is formal and theoretical, the techniques of this section, and those
of 4.8, bring this transformation into the realm of practicality, and make
its implementation a feasibility.

Finally, in Chapter 6 we examine some of the peripheral issues and questions
raised during the progress of this research. These deal with such topics
as various possible generalizations of our work and speculations on some
other kinds of transformations.


CHAPTER  2 


FUNDAMENTAL  CONCEPTS 


A (context-free) grammar is a 4-tuple $G = (V_N, V_T, S, P)$, where $V_N$ and
$V_T$ are disjoint sets of nonterminal and terminal symbols; $S$, the sentence
symbol, is a member of $V_N$; and $P$ is a set of productions or rules of the
form $A \to \alpha$, where $A \in V_N$ and $\alpha \in (V_N \cup V_T)^*$. An A-rule is
any rule $A \to \alpha$.

In general, we shall use the following notation with reference to a grammar
G. Lower-case letters like a, b, c denote elements of $V_T$; symbols x,
y, $\omega$, $\tau$, $\rho$ generally represent strings in $V_T^*$; A, B, X are
nonterminals, elements of $V_N$; $\alpha$, $\beta$, $\gamma$, $\varphi$, $\psi$ represent
strings in $(V_N \cup V_T)^*$; and $\sigma$ is any symbol from $V_N \cup V_T$. We use
$\varepsilon$ to stand for the empty string. When we depart from these conventions
or use other symbology, we shall explicitly quantify our usage.

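If $\omega \in V_T^*$, then $|\omega|$ is the length of $\omega$, the number of symbols in it.
The length of $\varepsilon$ is 0. If $\omega \in V_T^*$, then $\omega/k$ is that x such that
$|x| = k$ and $\omega = xy$ for some y; if $|\omega| < k$ then $\omega/k = \omega$. $V_T^k$ is the
set of $\omega \in V_T^*$ such that $|\omega| = k$.

$\mathrm{FIRST}_k(\alpha) = \{x \mid \alpha \Rightarrow^* xy \text{ for some } y \text{ and } |x| = k\}$;
$\mathrm{FOLLOW}_k(\beta) = \{x \mid S \Rightarrow^* \alpha\beta xy \text{ for some } \alpha \text{ and } y \text{ and } |x| = k\}$.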
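
The k-prefix operation and the FIRST sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of the grammar, the function names `k_prefix` and `first_k`, and the sample grammar are ours, and the sketch uses the common "truncating" variant of FIRST, which also keeps derived strings shorter than k (with the padding convention introduced later in this chapter, every string is long enough and the two variants coincide).

```python
def k_prefix(w, k):
    """The operation w/k: the first k symbols of w, or all of w if it is shorter."""
    return w[:k]


def first_k(grammar, k):
    """Fixed-point computation of the truncating FIRST_k set of every nonterminal.

    grammar maps each nonterminal to a list of right-hand sides (strings of
    one-character symbols); any symbol that is not a key is a terminal.
    """
    first = {A: set() for A in grammar}
    changed = True
    while changed:
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                acc = {""}                      # k-prefixes derivable so far
                for sym in rhs:
                    nxt = first[sym] if sym in grammar else {sym}
                    acc = {k_prefix(x + y, k) for x in acc for y in nxt}
                if not acc <= first[A]:
                    first[A] |= acc
                    changed = True
    return first


# Sample grammar (an assumption of this sketch): S -> AS | b, A -> Aa | b
G = {"S": ["AS", "b"], "A": ["Aa", "b"]}
FIRST2 = first_k(G, 2)
```

For this grammar and k = 2, A derives b, ba, baa, ..., so its set is {b, ba}; S additionally derives strings beginning bb.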

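We define the relation $\Rightarrow_G$ over strings in $(V_N \cup V_T)^*$ as follows:
$\alpha \Rightarrow_G \beta$ if and only if $\alpha = \alpha_1 A \alpha_2$, $\beta = \alpha_1 \gamma \alpha_2$, and
$A \to \gamma$ is a rule of G. (We shall generally omit explicit mention of G when
the reference is clear.) The relation $\Rightarrow^*_G$ is the reflexive transitive
closure of $\Rightarrow_G$; if $\alpha \Rightarrow^* \beta$, we say $\alpha$ produces or generates $\beta$.
We say $\alpha_1 A \alpha_2 \Rightarrow_R \alpha_1 \gamma \alpha_2$ if $\alpha_2 \in V_T^*$, and
$\alpha_1 A \alpha_2 \Rightarrow_L \alpha_1 \gamma \alpha_2$ if $\alpha_1 \in V_T^*$; $\Rightarrow^*_R$ and
$\Rightarrow^*_L$ are the reflexive transitive closures of $\Rightarrow_R$ and $\Rightarrow_L$.
If $\alpha \Rightarrow^* \beta$, then there is a sequence $\alpha_0, \alpha_1, \ldots, \alpha_m$ such that
$\alpha = \alpha_0$, $\beta = \alpha_m$, and $\alpha_i \Rightarrow \alpha_{i+1}$; this sequence is called a
derivation of $\beta$ from $\alpha$. The length of this derivation is m; in such a
case, we say $\alpha \Rightarrow^m \beta$. If $\alpha_0 \Rightarrow_L \alpha_1 \Rightarrow_L \cdots \Rightarrow_L \alpha_m$,
call it a leftmost derivation; and similarly for a rightmost derivation.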

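The language of a grammar G, L(G), is $\{x \in V_T^* \mid S \Rightarrow^* x\}$. It is well
known that $S \Rightarrow^k x$ iff $S \Rightarrow^k_L x$ iff $S \Rightarrow^k_R x$. We say $\alpha$ is a
sentential form if $S \Rightarrow^* \alpha$; a left sentential form if $S \Rightarrow^*_L \alpha$; and
similarly for a right sentential form. Say
$S = \alpha_0 \Rightarrow_L \alpha_1 \Rightarrow_L \alpha_2 \Rightarrow_L \cdots \Rightarrow_L \alpha_n = \omega$ is a leftmost
derivation of $\omega$; let $\alpha_i = x_i A_i \beta_i$ and $\alpha_{i+1} = x_i \gamma_i \beta_i$, where
$A_i \to \gamma_i$ is a rule of G. Then if $A_i \to \gamma_i$ is rule $p_i$, the sequence of
rules $p_0, p_1, p_2, \ldots, p_{n-1}$ is called the top-down (left-to-right) parse
of $\omega$. On the other hand, if
$S = \alpha_0 \Rightarrow_R \alpha_1 \Rightarrow_R \alpha_2 \Rightarrow_R \cdots \Rightarrow_R \alpha_n = \omega$, then the
sequence of rules $p_{n-1}, p_{n-2}, \ldots, p_2, p_1, p_0$ is the bottom-up
(left-to-right) parse of $\omega$. The parsing problem is, given $\omega$, to find a
top-down or bottom-up parse of $\omega$.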
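
The relationship between derivations and parses can be made concrete with a small sketch. The grammar below (S -> AS | b, A -> Aa | b) and the rule names p0..p3 are assumptions chosen for illustration, as is the helper `derive`; the sketch replays a sequence of rules as a leftmost or rightmost derivation, and obtains the bottom-up parse by reversing a rightmost derivation.

```python
# Rules named p0..p3 (hypothetical names) for S -> AS | b, A -> Aa | b.
RULES = {"p0": ("S", "AS"), "p1": ("S", "b"), "p2": ("A", "Aa"), "p3": ("A", "b")}
NONTERMINALS = {"S", "A"}


def derive(rule_names, leftmost=True, start="S"):
    """Apply the rules in order to the leftmost (or rightmost) nonterminal.

    Returns the list of sentential forms, starting with the start symbol.
    """
    form = start
    forms = [form]
    for name in rule_names:
        lhs, rhs = RULES[name]
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        if not positions:
            raise ValueError("no nonterminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"{name} does not apply to {form!r} at position {i}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


top_down = ["p0", "p2", "p3", "p0", "p3", "p1"]    # rules of a leftmost derivation of babb
rightmost = ["p0", "p0", "p1", "p3", "p2", "p3"]   # rules of a rightmost derivation of babb
bottom_up = list(reversed(rightmost))              # the bottom-up parse of babb
```

Calling `derive(top_down)` lists the left sentential forms S, AS, AaS, baS, baAS, babS, babb; `derive(rightmost, leftmost=False)` lists the right sentential forms of the same string.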

The LL(k) grammars, defined by Lewis and Stearns [13], are the largest
class of grammars for which a deterministic one-pass algorithm exists to find
the left-to-right top-down parse of $\omega$.

Definition. Let k > 0. A grammar G is LL(k) iff given $\omega \in V_T^k$, $A \in V_N$,
and $u \in V_T^*$, there is at most one production p such that for some y and v
in $V_T^*$, $S \Rightarrow^* uAv$, $A \Rightarrow^* y$ beginning with production p, and
$(yv)/k = \omega$.

Definition. G is strong LL(k) iff given $\omega \in V_T^k$ and $A \in V_N$, there is at
most one production p such that for some u, y, and v in $V_T^*$, $S \Rightarrow^* uAv$,
$A \Rightarrow^* y$ beginning with production p, and $(yv)/k = \omega$. (Equivalently, if
$A \to \varphi_i$ and $A \to \varphi_j$ are different A-rules, then
$\mathrm{FIRST}_k(\varphi_i\,\mathrm{FOLLOW}_k(A)) \cap \mathrm{FIRST}_k(\varphi_j\,\mathrm{FOLLOW}_k(A)) = \emptyset$.)

For any LL(k) grammar, it is possible to find an equivalent strong LL(k)
grammar. There is a very simple parsing procedure for any strong LL(k) grammar,
which uses a pushdown stack. The input to the algorithm is the string
x to be parsed. The procedure begins with the stack containing only S. At
any point in the procedure, either a nonterminal or terminal symbol will be
on top of the stack. If the top is a terminal symbol, it is compared against
the next input symbol: if there is a match, the symbol is popped off the
stack and removed from the input; otherwise, there is an error. If a
nonterminal A is on top of the stack, then we set $\omega$ equal to the next k symbols
of the input. By the definition of strong LL(k) grammar, the combination of
A and $\omega$ specifies a single rule p of the form $A \to \alpha$. This rule is the next
one in the parse; to continue the procedure, we remove A from the stack and
replace it by $\alpha$.
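
The procedure just described can be sketched as follows for k = 1. Everything specific here is an assumption made for illustration: the grammar S -> aSb | c (chosen because it is strong LL(1)), the hand-written table `TABLE`, the function name `ll1_parse`, and the use of `$` as an end marker.

```python
PAD = "$"                      # end marker appended to the input (an assumption)
NONTERMINALS = {"S"}
# (nonterminal on top of the stack, lookahead symbol) -> right-hand side to apply
TABLE = {("S", "a"): "aSb", ("S", "c"): "c"}


def ll1_parse(x):
    """Return the top-down parse of x as a list of (left side, right side) rules."""
    inp = list(x) + [PAD]
    stack = ["S"]              # the top of the stack is the end of the list
    parse = []
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            rhs = TABLE.get((top, inp[0]))
            if rhs is None:
                raise SyntaxError(f"no rule for {top} with lookahead {inp[0]!r}")
            parse.append((top, rhs))
            stack.extend(reversed(rhs))   # leftmost symbol of rhs ends up on top
        elif top == inp[0]:
            inp.pop(0)                    # terminal matches: consume it
        else:
            raise SyntaxError(f"expected {top!r}, found {inp[0]!r}")
    if inp != [PAD]:
        raise SyntaxError("input remains after the stack was emptied")
    return parse
```

For example, `ll1_parse("aacbb")` applies S -> aSb twice and then S -> c, which is the order of rules in the leftmost derivation of that string.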

The LR(k) grammars of Knuth [10] are the largest class of grammars for
which there exists a one-pass algorithm to find the left-to-right bottom-up
parse of x. There are a variety of equivalent definitions of this class. We
give one provided by Lewis and Stearns [13].

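Definition. G is LR(k) if it is unambiguous and if for all
$\omega_1, \omega_2, \omega_3, \omega_3'$ in $V_T^*$ and A in $V_N$,
$S \Rightarrow^* \omega_1 A \omega_3$, $A \Rightarrow^* \omega_2$, $S \Rightarrow^* \omega_1 \omega_2 \omega_3'$, and
$\omega_3/k = \omega_3'/k$ imply that $S \Rightarrow^* \omega_1 A \omega_3'$.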

This means that the part of a right sentential form last introduced into
it (the handle) can be determined by looking ahead k symbols after it.

The construction of a parser for LR(k) grammars is somewhat more complex,
and we describe it more carefully. There are several variants of the
construction of this parser; our formulation is similar to that of Aho and Ullman [1],
but differs in certain respects.

Definition. If G is an LR(k) grammar, then $A \to \alpha . \beta (\omega)$ is an LR(k)-item
over G if $A \to \alpha\beta$ is a rule of G, $\beta \neq \varepsilon$, and $\omega \in V_T^k$. Furthermore,
if $A \to \varepsilon$ is a rule of G, then $A \to . \varepsilon (\omega)$ is an LR(k)-item over G.

Definition. If $I = A \to \alpha . \sigma\beta (\omega)$ is an LR(k)-item over G, then

i)  I is an A-item

ii)  I is a $.\sigma$-item

iii)  if $\sigma \in V_T$, then I is a terminal item; furthermore,
$A \to . \varepsilon (\omega)$ is a terminal item

iv)  $\sigma$ is the post-dot component of I

v)  $\sigma\beta$ is the post-dot section of I

vi)  $\omega$ is the context of I

vii)  if $\alpha \neq \varepsilon$, I is an essential item

Informally, the language of an item $A \to \alpha . \beta (\omega)$ is
$\{x \mid \beta\omega \Rightarrow^* x\}$, and its k-language is $\mathrm{FIRST}_k(\beta\omega)$.

Definition. The relation "immediate descendant" on LR(k)-items is defined by:
$B \to . \gamma (\omega)$ is an immediate descendant of $A \to \alpha . B \beta (\tau)$ if
$\omega \in \mathrm{FIRST}_k(\beta\tau)$. The relation "descendant" is the reflexive transitive
closure of "immediate descendant." Item $I_1$ is an ancestor of $I_2$ iff $I_2$ is
a descendant of $I_1$.

Definition. The completion of an item I is the set of all items which are
descendants of I. The completion of a set of items is the union of the
completions of the individual items.

This concept has been called the closure by Aho and Ullman [1].
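
The completion can be computed with a simple worklist. The following is a minimal sketch for k = 1 under several assumptions of ours: items are encoded as tuples (left side, right side, dot position, context), the grammar S -> AS | b, A -> Aa | b has no empty right-hand sides (so that FIRST of a string is determined by its first symbol), and `$` stands in for an end-of-input context.

```python
PAD = "$"                                        # stand-in end-of-input context
GRAMMAR = {"S": ["AS", "b"], "A": ["Aa", "b"]}   # assumed: no empty right-hand sides


def first1_sets():
    """FIRST_1 of every nonterminal, by fixed-point iteration."""
    first = {A: set() for A in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for A, alternatives in GRAMMAR.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in GRAMMAR else {head}
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first


FIRST = first1_sets()


def first1(beta, context):
    """FIRST_1(beta context); valid only because no nonterminal derives the empty string."""
    if not beta:
        return {context}
    return set(FIRST[beta[0]]) if beta[0] in GRAMMAR else {beta[0]}


def completion(items):
    """All descendants of the given items (the relation is reflexive)."""
    result, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot, context = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:       # post-dot component is a nonterminal B
            for w in first1(rhs[dot + 1:], context):     # w ranges over FIRST_1(beta tau)
                for gamma in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], gamma, 0, w)       # immediate descendant B -> .gamma (w)
                    if item not in result:
                        result.add(item)
                        todo.append(item)
    return result


# Completion of the S-items with context PAD.
start = completion({("S", rhs, 0, PAD) for rhs in GRAMMAR["S"]})
```

For this grammar the result has six items: the two S-items with context `$`, and the A-items A -> .Aa and A -> .b in each of the contexts a and b.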


Definition. The completion of the nonterminal A is the completion of the set
of non-essential A-items. The completion of A in the context $\omega$ is the
completion of all non-essential A-items whose context is $\omega$ (i.e., all items of the
form $A \to . \alpha (\omega)$).

Before giving the parsing algorithm, we make a comment about endmarkers.
We shall assume that every input string we shall be considering
is followed by $\dashv^k$, k copies of the right padding symbol $\dashv$. This is
to ensure that in looking ahead k symbols into the input, any parser will always
have at least k symbols to look at; and that $\mathrm{FIRST}_k$ and $\mathrm{FOLLOW}_k$ are well-defined.

Our model of an LR(k) parser for G is driven by a finite-state machine-like
device called the canonical LR(k) machine for G. This machine consists
of a set of non-final states (described below) and a set of final states, one
for each rule of G. The entry of the machine to the final state for $A \to \alpha$
during the course of a parse of x indicates that $\alpha$ has just been located as
the handle, and it is to be reduced to A. In this way, the sequence of final
states of the machine entered during a parse provides the sequence of rules of
G that form the bottom-up parse of x in G.

The states of an LR(k) machine are connected by a transition function;
this function f takes a non-final state q, a symbol $\sigma$, and a lookahead of k
symbols $\omega$, into a state (final or non-final). At any point during the parse,
some state of the machine will be current, the parser will have some symbol
in hand, and there will be a lookahead consisting of the next k input symbols
that have not yet been read; the transition function will compute from these
three arguments what is to be the next current state of the machine.

But the LR(k) parser does more than just go through states of the LR(k)
machine; it also has some side effects on a stack used during the parse. This
stack contains symbols of the grammar alternating with names of states of the
machine; at any point, the topmost stack element will be a state name, that
of the current state of the machine. If this is a non-final state, then the
parser reads the next symbol of input onto the top of the stack; the next
state is computed from the current state, the symbol just read, and the next
k symbols of input. This next state name is put on top of the stack, and the
process continues. If the current state of the machine is a final state, then
it is associated with some rule of G, say $A \to \alpha$. Then it will be the case
that the topmost section of the stack consists of the symbols of $\alpha$
alternating with state names. The parser then proceeds to pop off the stack all these
symbols, down through the first symbol of $\alpha$. This exposes some state name q
as the stack top. Then A is put on top of q, and q' is put on top of A,
where q' is computed from q, A, and the next k symbols of input. This is the
next configuration of the parser, and the process continues.

The LR(k) parser begins with the stack containing just $q_0$, which is the
name of the starting state of the machine, and with the remaining input being
$x\dashv^k$, where x is the string to be parsed. The parser recognizes x if it
eventually reaches a configuration where the stack contains just $q_0 S$ and the
remaining input is $\dashv^k$.

We  see  then  that  an  LR(k)  parser  consists  of  a  standard  driver,  which 
uses  the  particular  LR(k)  machine  for  G  to  find  the  bottom-up  parse  for  strings 
in  L(G).  The  non-final  states  of  the  LR(k)  machine  for  G  are  each  comprised 
of  a  set  of  LR(k)  items  over  G.  The  states  of  this  machine  and  the  transition 
function connecting them are defined in the following way.

1)  The completion of S in the context $\dashv^k$ is the starting
state of the machine.

2)  Let q be any non-final state of the machine. For any
symbol $\sigma$, consider the items of q of the form $A \to \alpha . \sigma\beta (\omega)$
where $\beta \neq \varepsilon$. Consider the set $E_\sigma$ of corresponding items
$A \to \alpha\sigma . \beta (\omega)$. Then there is a non-final state q' which is
the completion of the set $E_\sigma$. And we set $f(q, \sigma, x) = q'$
if there is some item $A \to \alpha . \sigma\beta (\omega)$ in q such that
$x \in \mathrm{FIRST}_k(\beta\omega)$; q' is called a $\sigma$-successor of q.

3)  If $A \to \alpha . \sigma (\omega)$ is an item of q, then $f(q, \sigma, \omega)$ is the
final state associated with $A \to \alpha\sigma$.


Repeated application of this definition eventually stops creating new
non-final states, and the result is the LR(k) machine for the grammar.
Observe that the set $E_\sigma$ of step 2) forms the set of essential items of the new
state; so every non-final state of the machine is precisely equal to the
completion of its essential items.

It is well known that it is possible to compute $\mathrm{FIRST}_k(\alpha)$ for any LR(k)
grammar. Therefore it is possible to compute the completion of any set of
items, and hence the definition of LR(k) machine just given can be used to
compute the machine.
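
As an illustration of how steps 1) to 3) can be turned into a procedure, here is a minimal sketch for k = 1. It is not a general implementation: it assumes a grammar without empty right-hand sides (here S -> AS | b, A -> Aa | b), uses `$` in place of the padding symbol, encodes items as tuples (left side, right side, dot position, context), and represents final states as tuples tagged `"final"`. All function and variable names are ours.

```python
PAD = "$"
GRAMMAR = {"S": ["AS", "b"], "A": ["Aa", "b"]}   # assumed: no empty right-hand sides


def first1_sets():
    first = {A: set() for A in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for A, alternatives in GRAMMAR.items():
            for rhs in alternatives:
                new = first[rhs[0]] if rhs[0] in GRAMMAR else {rhs[0]}
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first


FIRST = first1_sets()


def first1(beta, ctx):
    """FIRST_1(beta ctx), valid because no nonterminal derives the empty string."""
    if not beta:
        return {ctx}
    return set(FIRST[beta[0]]) if beta[0] in GRAMMAR else {beta[0]}


def completion(items):
    result, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot, ctx = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for w in first1(rhs[dot + 1:], ctx):
                for gamma in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], gamma, 0, w)
                    if item not in result:
                        result.add(item)
                        todo.append(item)
    return frozenset(result)


def build_machine():
    start = completion({("S", rhs, 0, PAD) for rhs in GRAMMAR["S"]})   # step 1
    states, f, todo = {start}, {}, [start]
    while todo:
        q = todo.pop()
        for sym in {rhs[dot] for (_, rhs, dot, _) in q}:
            movers = [it for it in q if it[1][it[2]] == sym]
            essential = {(l, r, d + 1, c) for (l, r, d, c) in movers if r[d + 1:]}
            for (lhs, rhs, dot, ctx) in movers:
                if not rhs[dot + 1:]:                                  # step 3
                    f[(q, sym, ctx)] = ("final", lhs, rhs)
            if essential:                                              # step 2
                successor = completion(essential)
                for (lhs, rhs, dot, ctx) in movers:
                    if rhs[dot + 1:]:
                        for x in first1(rhs[dot + 1:], ctx):
                            f[(q, sym, x)] = successor
                if successor not in states:
                    states.add(successor)
                    todo.append(successor)
    return start, states, f
```

For this grammar the construction stops after producing two non-final states: the starting state, and its A-successor, which is also its own A-successor.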

Furthermore, it is possible to prove the following two results.

Fact. If q is any non-final state of the LR(k) machine for an LR(k) grammar
G, then for any $a \in V_T$ and $\omega \in V_T^k$, at most one of
$f(q, \varepsilon, a\omega/k)$ and $f(q, a, \omega)$ is defined.


Fact. If q is any non-final state of the LR(k) machine for an LR(k) grammar
G, then for any $\sigma \in V_N \cup V_T$ and $\omega \in V_T^k$, $f(q, \sigma, \omega)$ is
single-valued.

These facts assure us that the parsing algorithm we outlined previously
is indeed sensible and can be formalized, and that the parsing algorithm
operates deterministically. It can also be shown that this algorithm, using
this machine, does indeed successfully perform the left-to-right bottom-up
parse of strings in L(G). That is, it recognizes just the strings in L(G),
and the sequence of final states it enters while parsing x defines the
bottom-up parse of x in G.

There are several refined models of the LR(k) parsing machine which can
often be used to guide the parsing, but the machine construction detailed
above produces the "canonical" LR(k) parsing machine for G.

We shall represent LR(k) machines as follows. A non-final state is
denoted by a rectangle containing its defining set of items; a final state is
an oval containing the name of the rule with which it is associated. If
$f(q, \sigma, \omega) = q'$, then an arrow is drawn from q to q' labelled by $\sigma/\omega$; if
$f(q, \sigma, \omega') = q'$ as well, the transition is labelled by $\sigma/\omega, \omega'$.

As an example, the canonical LR(1) machine for the grammar $S \to AS$,
$S \to b$, $A \to Aa$, $A \to b$, is given in the figure below. The states are numbered
purely for convenience. State 1 is the initial state of the machine and
state 2 is the only other non-final state.


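[Figure: the canonical LR(1) machine for this grammar. The A-transitions
are labelled A/a,b.

State 1:
    $S \to .AS(\dashv)$
    $S \to .b(\dashv)$
    $A \to .Aa(b)$
    $A \to .Aa(a)$
    $A \to .b(b)$
    $A \to .b(a)$

State 2 (items as far as they are legible in the source):
    $A \to A.a(b)$
    $A \to A.a(a)$
    $S \to A.S(\dashv)$
    $S \to .AS(\dashv)$
    $S \to .b(\dashv)$
    $A \to .Aa(b)$
    $A \to .Aa(a)$]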

The successive stages of the parse of babb using this machine are given
below. We show the sequence of stacks, with the remaining input at each stage
beneath it.

[Figure: the sequence of parser stacks and remaining inputs for babb.]

This sequence demonstrates that babb is indeed in the language generated by
the grammar, and the sequence of final states entered does correctly give the
bottom-up parse of this string.
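
The parse of babb can be replayed with a short sketch of the standard driver. The transition table `F` below is our own hand derivation for the example grammar from the three construction steps given earlier, with k = 1 and `$` standing in for the padding symbol; final states are written directly as (left side, right side) rules, and the names `F` and `lr_parse` are hypothetical. It is offered as an illustration, not as a transcription of the original figure.

```python
PAD = "$"   # stands in for the right padding symbol; k = 1

# f(state, symbol, lookahead) for S -> AS | b, A -> Aa | b.
# Non-final states are the integers 1 and 2; final states are rules.
F = {}
for la in "ab":
    F[(1, "A", la)] = 2
    F[(2, "A", la)] = 2
    F[(1, "b", la)] = ("A", "b")
    F[(2, "b", la)] = ("A", "b")
    F[(2, "a", la)] = ("A", "Aa")
F[(1, "b", PAD)] = ("S", "b")
F[(2, "b", PAD)] = ("S", "b")
F[(2, "S", PAD)] = ("S", "AS")


def lr_parse(x, start_state=1, sentence_symbol="S"):
    """Return the bottom-up parse of x as a list of (left side, right side) rules."""
    inp = list(x) + [PAD]
    stack = [start_state]               # symbols alternate with state names
    parse = []
    while True:
        top = stack[-1]
        if isinstance(top, tuple):      # final state: the handle is on the stack
            lhs, rhs = top
            del stack[-2 * len(rhs):]   # pop the handle and its state names
            parse.append((lhs, rhs))
            if len(stack) == 1 and lhs == sentence_symbol and inp == [PAD]:
                return parse            # stack would hold just q0 S: accept
            symbol = lhs
        else:                           # non-final state: read the next symbol
            if len(inp) == 1:
                raise SyntaxError("unexpected end of input")
            symbol = inp.pop(0)
        key = (stack[-1], symbol, inp[0])
        if key not in F:
            raise SyntaxError(f"no transition for {key}")
        stack += [symbol, F[key]]
```

`lr_parse("babb")` enters the final states for A -> b, A -> Aa, A -> b, S -> b, S -> AS, S -> AS in that order, which is the reverse of the rightmost derivation of babb.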


CHAPTER 3

PREDICTION IN BOTTOM-UP PARSING:
STATE-SPLITTING AND MULTIPLE STACK PARSING


3.1  Basic Ideas of Prediction and State-Splitting

Prediction is a concept usually associated with top-down parsing,
rather than bottom-up. The entire process of top-down parsing consists
of nothing more than making a sequence of predictions and then seeing
them come true. Given a nonterminal and a lookahead string, we predict which
rule of the grammar must be applied to the nonterminal in order to
effect the eventual generation of the lookahead string. After matching
leading terminals on the right-hand side of the rule with an initial
segment of the lookahead, we reapply the procedure to the first
nonterminal on this right-hand side. Thus we are always predicting the
occurrence of a nonterminal in a specific location in the parse tree, as an
ancestor of some segment of the sentence being parsed, before we know
exactly how that segment is generated from the nonterminal.

On the surface, this concept seems completely alien to the approach
embodied in bottom-up parsing. There, nothing whatsoever is predicted;
no anticipation of the future ever plays a role in the progress of a parse.
It is only when we have reached the end of the handle that we recognize it as
such and discover to which nonterminal it is to be reduced. We do not
make use of information that we have gained in the past in order to restrict
future activities. But in one important sense, there is an air of prediction
to the entire proceedings. Before the bottom-up parse of a sentence begins,
we are effectively predicting that we shall reduce it to S, the sentence symbol.
The entire parse is predicated on this assumption; and when the prediction is
fulfilled, and an S has been located by successive reductions of the input,
then the parse is complete. We will generalize this approach and utilize
prediction not only at the beginning of a bottom-up parse, but at selected
points throughout. That is, at various times we shall predict that the
bottom-up parser, proceeding in a normal fashion, will eventually reduce some prefix
of the remaining input to a specified nonterminal. Putting it in terms of the
LR(k) machine directing a parse, we shall see how to utilize predictive
techniques not just at the initial state of the machine (where S is effectively
being predicted), but upon entry to other states in the machine.

Before we develop the formal theory of this method of parsing, some
informal discussion is in order, to describe the model underlying the
development and provide some intuition as to the issues involved and the problems
that will arise. The first thing we must understand is how a state of an
LR(k) machine is used during the parse of a sentence by that machine. For
the time being, we shall be restricting our discussion to so-called "canonical"
LR(k) machines, but we shall see later how our work applies to the more
efficient parsers of DeRemer and others.

Suppose then that we have just entered some non-final state q of an LR(k)
machine during the parse of a sentence by the machine, and that at this time
the remaining input string is $\omega = a_1 a_2 \ldots a_m$. The next step in the parse
will be to read $a_1$ onto the stack and transfer to some state $q_1$. The symbol
$a_1$ will be the post-dot component of some of the items of q. After further
symbols have been read and possibly some reductions made along the way, the
machine will enter the final state associated with one of these $.a_1$ items.
Let us assume that this item of q is non-essential, of the form
$A_1 \to . a_1 x_1 (\tau_1)$. Then we have reduced $a_2 \ldots a_i$ for some i to $x_1$,
and will now reduce $a_1 \ldots a_i$ to $A_1$ (with the lookahead being $\tau_1$). The
machine will pop the stack exposing q, and will then follow an $A_1$-transition
to the state $q_2$. Again, $A_1$ will be the post-dot symbol of some items of q,
and eventually, we will arrive at the final state associated with one of
these items, again assumed to be of the form $A_2 \to . A_1 x_2 (\tau_2)$. At this
point, we have reduced $a_{i+1} \ldots a_j$ for some j to $x_2$, and are about to
reduce $A_1 x_2$ to $A_2$, the lookahead being $\tau_2$. We once again return to q
and follow an $A_2$ transition, eventually performing the reduction
$A_3 \to A_2 x_3$, with lookahead $\tau_3$. This process continues until we perform a
reduction associated with an essential item of q. That is, until we reach
the final state for $B \to \beta A_n x_{n+1}$, where $B \to \beta . A_n x_{n+1} (\tau_{n+1})$ was
an essential item of q. At this point, we will perform the appropriate
reduction but will not return to q but rather to a state of which q was a
$\beta$-successor.

Assume that the remaining input string at the time the reduction to $A_n$
is made is $a_{l+1} \ldots a_m$; that is, by that time we have read $a_1 \ldots a_l$. Let
us see what has been accomplished since our original entry to q. We have
constructed the section of the parse tree shown in Figure 3.1. First we
reduced some initial segment of $\omega$ to $A_1$; then $A_1$ and another segment to
$A_2$; and so on, until we had reduced $a_1 \ldots a_l$ to $A_n$, the post-dot component
of an essential item of q. After the recognition of each $A_i$, we returned to
q to find the next state to go to; our final return was made after the finding
of $A_n$. In some sense, our mission upon entry to q was to locate the post-dot
component of one of its essential items. Once this had been accomplished,
we no longer continued to return to q to compute the next state to go to.
This task was performed by recognizing a sequence of non-essential items of
q, the recognition of one being the first step in the recognition of the next.
Each of these recognized nonterminals is an ancestor of a progressively longer
initial segment of $\omega$, until all of $a_1 \ldots a_l$ has been reduced to $A_n$.

Figure 3.1

But there is a further relationship among the rules $A_{i+1} \to A_i x_{i+1}$.
Recall that each was underlying the nonessential item
$A_{i+1} \to . A_i x_{i+1} (\tau_{i+1})$ of q. By the order in which the reductions were
made, it is easy to see that $A_i \to . A_{i-1} x_i (\tau_i)$ is an immediate
descendant of $A_{i+1} \to . A_i x_{i+1} (\tau_{i+1})$ in the state q. In summary, then, we
know upon entry to the state q that some initial segment of the remaining
input string will be reduced to the post-dot component of one of the essential
items of q; and that this reduction will be accomplished by the successive
reductions of nonessential items of q, each of which is an immediate
descendant of the next one.

How can we use, in advance, this information as to what of necessity
is going to occur once we enter q? Suppose that all the essential items of
q have the same post-dot component, namely the nonterminal A. Then no matter
which of these items the parser is really "working on", no matter which of
the underlying productions will ultimately be recognized, we know for certain
that before we leave q for the last time, we will have reduced some initial
segment of the remaining input to an A. This then is the function of state
q: to find an A. We can predict, upon entry to q, that an A is going to be
found; and once that A has been found, we can leave q behind.

To take a more complex example, consider the LR(0) state q of Figure 3.2.
Here there are two essential items, with different post-dot components, so
seemingly neither one could be predicted if we were to enter q during a parse.


    $A \to x . B y_1$
    $C \to x . D y_2$
    $B \to . E y_3$
    $D \to . E y_4$
    $E \to . F y_5$
    $E \to . G y_6$
    $F \to . y_7$
    $G \to . y_8$


Figure 3.2


But even though neither B nor D can be accurately predicted upon entry
to q, it is possible to predict that some initial segment of the remaining
input will be reduced to an E. For whether the ultimate reduction is to B
or D, a prior reduction must be to an E, since the only items with B and
D on the left-hand side have E as their post-dot component. We cannot
predict how the remaining parse will begin: it might start with $G \to y_8$ or
$F \to y_7$; nor can we predict the final nonterminal to be found by q: it could
be B or D. But somewhere along the way, the parser will find an E, pop the
stack to expose q, and then go to an E-successor of q. Thus we see that in
certain situations (to be characterized later), it is possible to predict and
identify a specific nonterminal which will be recognized at some future time
in the parse.
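
The informal argument for the state of Figure 3.2 can be checked mechanically. The sketch below uses a criterion of our own, offered only as one plausible reading of that argument and not as the characterization developed later: a nonterminal X counts as predictable if every chain of immediate descendants leading from an essential item to a terminal item passes through an item whose post-dot component is X. The item encoding and the names `Q` and `must_be_found` are assumptions of the sketch.

```python
NONTERMINALS = set("ABCDEFG")

# LR(0) items of the state in Figure 3.2 as (left side, right side, dot position);
# each y is treated as a single terminal symbol.
Q = {
    ("A", ("x", "B", "y1"), 1),
    ("C", ("x", "D", "y2"), 1),
    ("B", ("E", "y3"), 0),
    ("D", ("E", "y4"), 0),
    ("E", ("F", "y5"), 0),
    ("E", ("G", "y6"), 0),
    ("F", ("y7",), 0),
    ("G", ("y8",), 0),
}


def must_be_found(state, X):
    """True if no terminal item can be reached from an essential item
    without first passing an item whose post-dot component is X."""
    essential = [item for item in state if item[2] > 0]
    seen, todo = set(essential), list(essential)
    while todo:
        lhs, rhs, dot = todo.pop()
        post_dot = rhs[dot]
        if post_dot == X:
            continue                      # this chain goes through X; stop expanding
        if post_dot not in NONTERMINALS:
            return False                  # reached a terminal item without meeting X
        for item in state:
            if item[0] == post_dot and item[2] == 0 and item not in seen:
                seen.add(item)
                todo.append(item)
    return True


predictable = sorted(X for X in NONTERMINALS if must_be_found(Q, X))
```

For this state only E satisfies the criterion: B fails because the chain through $C \to x . D y_2$ avoids it, D fails symmetrically, and F and G fail because the parse may begin with either of them.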

Let it be stressed that this is a prediction about bottom-up parsing;
that is, though we predict that an E will be found, the way in which that E
will be constructed will be by the further actions of the standard bottom-up
LR(0) parser.

Is there any way to utilize this predicted information? Let us use a
programming analogy. An LR(k) machine as a whole can be thought of as a
main program, with the stack as its working space. Its goal is to find an S,
to reduce the input string to the sentence symbol. Thus entry to the initial
state of the machine is analogous to calling the main routine. Now suppose
it were possible to predict an A in state q of the machine. Then entering
state q would be similar to calling a subroutine whose goal would not be the
reduction of the entire input to an S, but that of reducing some initial
segment of the remaining input to an A. Upon completion of the task assigned
it, the subroutine would return control to the main program at the point from
which it was called; after that, the main program could carry on as if it itself
had found the A in the normal ex post facto fashion, rather than having
predicted its occurrence and dispatched the task of finding it to the subroutine.
The subroutine itself would operate in much the same way as the main program,
performing a bottom-up parse; except that it knows to quit when it has found
an A; and let us further say that it has its own work space, its own stack.
Note that the subroutine returns no value or information to the calling routine,
but merely signals its completion. Thus the order of parse remains unaffected
by the added feature of predicting in advance some of the results of the
bottom-up parse.

But to complete the analogy, how do we design the subroutine to recognize
the A? If the full LR(k) machine is the main program, what is the subprogram?
Let us say it will be a smaller machine, in some sense a submachine of the
original one. Its construction will be much like that of any LR(k) machine;
its states, sets of LR(k)-items. However its initial state, rather than being
the completion of all S-items, will be the completion of all A-items. As a
first approximation, we can think of the new machine as being the LR(k)-parsing
machine for the subgrammar of the original grammar which has A as its starting
symbol. Successors of states in the submachine will be computed as usual,
and the machine will operate like any other on its own stack. The only
difference will be that there will be an A-transition from the starting state
to a designated state entitled POP; and when POP is entered during a parse,
it causes the submachine to erase its stack and return control to the main
machine. Thus the submachine knows to suspend operation when its job is
completed.

How is the submachine to be called from the main machine, and how does
the existence of the new machine affect the design of the original one? We
have seen that the function of the new machine is to find an A which can
be predicted upon entrance to state q of the main machine. This then is the
appropriate place for control to be transferred to the submachine: from
state q a special predictive transition will be made to the initial state
of the submachine. This transition basically leaves the original stack
unaltered. Control is returned to state q when the submachine is finished,
and processing then continues normally by the main machine on its own stack.

It is certainly reasonable to design the main machine and the submachine
so that anything which is performed by the submachine is not also done by
the main machine. In particular, since the responsibility of finding an A
has been delegated to the submachine, state q no longer has to be concerned
with this. So all A-items and their descendants can be deleted from q, because
they are used only to reduce some initial segment of the input to an A. Thus
the designer of the machine can utilize the fact that the task of finding
an A in q can be delegated to a submachine, by leaving out of q certain items
that would otherwise be in it. We can look at this smaller version of q as
really being a new state in the main machine. It seems that the items which
can be deleted from q are precisely those items which go into the initial
state of the submachine. Thus the state q is being split into two states:
one which replaces q in the original machine, and one which is the starting
state of the new submachine. The items in these two states comprise precisely
the set of items in q.

Obviously, deletion of some items from q requires the recomputation of
the successors of q in the original machine, but this is done in the normal
fashion. An example of the entire process is illustrated in Figure 3.3. There
we have constructed part of an LR(0) machine containing the states of Figure 3.2
in which, as we argued previously, an E could be predicted. We also show what
the revised machine looks like after the state has been "split". (Throughout,
we have treated the yᵢ as though they were single terminal symbols; of course,
in reality they could be strings of terminals and nonterminals.) The submachine
to find an E has been constructed; transition is made to its initial state
(called the predictive state) from a state which occupies the place of q in the
main machine (called the base state of the splitting of q). Note that every
state in the revised main machine and in the submachine is a subset of some
state in the original machine. Further note that the two diagrams are
functionally equivalent as far as the rest of the machine is concerned; that
is, entrance to q in the original machine and to the base state in the new
machine will produce identical results.

[Figure 3.3: Part of the LR(0) machine before state-splitting, and the revised
main machine with its submachine after state-splitting.]

In fact, it is easier to think of the new diagram not as being of two
LR(0) machines, one of which can call the other, but rather as representing
a generalized kind of parsing machine, in which it is possible for one state
to transfer control to another part of the machine and get it back when that
part has completed its predefined task. Each time a call transition is made,
a new stack is created for the benefit of the called submachine. This
submachine will use this new stack as its parsing stack, and will destroy it when
control is returned to the calling state. Since (as we shall see) it is
possible for submachine calls to be nested (i.e., for one submachine to call
another, or itself recursively), it is more convenient to think of the machine
at large as really having a stack of stacks: each time a call of a submachine
is made, a new stack level is created, and all processing utilizes that level
(which is itself a stack) until either another call is made or until the
current call is completed and a return is effected. When a return is made,
the topmost, most recently activated stack level is discarded, and processing
continues on the exposed level.
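The stack-of-stacks discipline just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `StackOfStacks`, its method names, and the state labels `q0`, `p0`, `p1`, `q1` are hypothetical and do not come from the text.

```python
class StackOfStacks:
    """A parsing stack organized as a stack of stack levels (hypothetical API)."""

    def __init__(self, start_state):
        # The main machine begins with a single level holding its start state.
        self.levels = [[start_state]]

    def push(self, symbol, state):
        # Ordinary processing always uses the topmost level.
        self.levels[-1].extend([symbol, state])

    def call(self, nonterminal, predictive_state):
        # Prediction: write the predicted nonterminal on the old level,
        # then create a new level headed by the predictive state.
        self.levels[-1].append(nonterminal)
        self.levels.append([predictive_state])

    def ret(self, successor_state):
        # Entering POP: destroy the topmost level and resume on the exposed
        # level in the successor of the calling state under the nonterminal.
        if len(self.levels) == 1:
            raise RuntimeError("no submachine call is active")
        self.levels.pop()
        self.levels[-1].append(successor_state)


s = StackOfStacks("q0")
s.call("E", "p0")                 # predict an E and open a new level
s.push("y", "p1")                 # the submachine works on its own level
depth_during_call = len(s.levels)
s.ret("q1")                       # POP: back to the main machine's level
```

After the return, the lower level reads `q0 E q1`, which is what a conventional single-stack parse would hold once the E had been recognized.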

It should be observed that the order of recognition by such a machine
is still strictly left-to-right, bottom-up. All the extra work of making
predictions and fulfilling them, of creating and discarding stack levels,
as of yet makes no appreciable difference. A comparison is made in Figure 3.4
between a conventional LR parse and one utilizing this predictive capability.
Actually, it is a portion of a parse, showing how a substring beginning with x
would be handled by the two machines of Figure 3.3, assuming that the x takes
us from some state into the state that is split in Figure 3.3. (We do not know
or care what is on the stack underneath the q₀, since it is unaffected by this
segment of the parse.)

[Figure 3.4: A conventional LR parse compared with the revised parse with
prediction.]

In the second part of the figure, each stack level is written horizontally;
laying the stack levels end to end approximately reproduces the single stack
of the first parse. In the revised parse, an E is predicted upon entry to
the base state; this causes an E to be written on the old stack level, and a new
stack level to be created, which uses the predictive state as its starting state.
Thereafter parsing continues normally until the POP state is entered; then the top
stack level is destroyed, and the machine resumes processing on the lower level,
transferring to the E successor of the state from which the call had been made.
Note that there are two more steps in the predictive parse, to account for the
prediction and fulfillment steps, but otherwise the parses and stack
configurations are essentially the same. In particular, the final states are
entered in precisely the same order, thus preserving the order of recognition
in the parse.

We want to emphasize at this point that state-splitting is not a process
employed during the parse of a string by a machine, but during the construction
of such a machine by its designer. That is, the designer notices that should
state q be entered during a parse, an A is sure to be recognized; he realizes
that something can thus be predicted upon entry to q. Therefore, he "splits"
q into two states, and divides the machine into a main machine and a submachine.
This machine will operate in such a way that when the base state of the
splitting of q is entered during a parse, the machine "predicts" the
recognition of an A, and transfers to the submachine to accomplish this recognition.

3.2 Prediction with Lookahead

There are numerous ramifications of these notions of state-splitting and
prediction that must be considered. First of all, consider the LR(1) state
of Figure 3.5. Initially, it would seem impossible, upon entry to this state
during the course of a parse, to have any advance information about the
nonterminals to be found by this state; that is, to be able to make a prediction.
This is because there are two essential items in the state, which have
different post-dot components; and these components do not have any common
leftmost nonterminal descendant.


A → x.By₁(z₁)
C → x.Dy₂(z₂)
B → .a(y₁)
D → .b(y₂)

Figure 3.5

However, suppose that upon entry to this state we allowed ourselves to
examine the remaining input string in order to assist us in predicting the
future course of the parse. In general, we will allow ourselves to look
ahead k symbols into the input. Now in this case, k = 1; and the only two
possible 1-lookaheads are a and b. And since L₁(B) = {a} and L₁(D) = {b},
observation of the lookahead does give us enough information to effect a
prediction. In particular, upon seeing a, we could predict B; and upon seeing b,
we could predict D. This is because the first symbol of the remaining input
string must be generated from the post-dot section of one of the essential
items of the state. Thus the presence of an a indicates the existence of a
B in the parse tree, while b does the same for D.

In examining this state and designing the machine into which it fits,
we can easily take advantage of this analysis. We can "split" this state
as we did earlier, but now make the predictive transition depend on the
lookahead being examined. This is illustrated in Figure 3.6. Here we have
shown how this conditional prediction, or prediction on lookahead, could
be represented. The notation /a means that if upon entry to the base state
of the splitting, the first symbol of the remaining input string should be
an a, then the machine would predict the occurrence of B and jump to the
predictive state shown. Note that the a would not be read onto the stack
by the base state prior to the prediction being made; rather, the a is
examined in its place in the input, and left there to be read later by the
predictive state. If upon entry to the base state, the lookahead should not
be a, then no prediction would be made, and control would remain with the
base state, which would proceed to read the next symbol onto its stack.

[Figure 3.6: The state of Figure 3.5 split into a base state, containing
A → x.By₁(z₁), C → x.Dy₂(z₂) and D → .b(y₂), and a predictive state entered
on /a, which leads to POP.]

In the state-splitting of Figure 3.6, we have made no explicit
utilization of the fact that the appearance of b as the lookahead symbol enables us
to predict the occurrence of D. We could have designed an alternate
state-splitting, with the predictive transition on /b and the predictive state
containing the item D → .b(y₂), to account for this case. Or we could split
the original state into three parts (a base and two predictive states) with
a predictive transition from the former to each of the latter. This splitting
is shown in Figure 3.7.


Figure  3.7 

The meaning of this splitting is the most intuitive one. Should the
parser enter this base state during a parse, the next input symbol would be
inspected, and its identity would determine which nonterminal is to be
predicted and to which state to transfer.
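The behaviour of the three-way splitting can be pictured as a table lookup on the next input symbol. The following is a minimal sketch, assuming single-character symbols; the table name `PREDICT`, the function name, and the state labels are hypothetical.

```python
# Predictive transitions out of the base state, keyed by the 1-symbol
# lookahead. The state labels are hypothetical names, not from the text.
PREDICT = {
    "a": ("B", "predictive_state_for_B"),
    "b": ("D", "predictive_state_for_D"),
}


def on_entry_to_base(remaining_input):
    """Inspect, but do not consume, the next input symbol.

    Returns (predicted nonterminal, state to transfer to), or None when no
    prediction applies and control stays with the base state.
    """
    lookahead = remaining_input[:1]
    return PREDICT.get(lookahead)
```

The lookahead symbol is only examined here; it stays in the input and is read later by whichever predictive state receives control.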

This then serves as our introduction to the notion of prediction on
lookahead. Informally, we can say that a lookahead ω indicates the presence
of nonterminal A in state q if upon entry to q, the occurrence of ω as the
lookahead assures us that some initial segment of the remaining input will
eventually be reduced to an A. That is, all ways of generating a string
beginning with ω, from the post-dot section of any essential item in q, must
involve the generation of an A as the ancestor of some prefix of the string.
If this is the case, then we know that at some time in the parse the machine
will return to q having found an A; and so we can consider instead delegating
the discovery of this A to a submachine. Let it be stressed that if ω indicates
an A in q, it does not tell us that A is the post-dot component of some
essential item of q, nor give us any specific information as to the precise
location of this A in the tree; we can only be sure that an A will be there
somewhere below the post-dot component of some essential item of q.

Before we decide how and when a nonterminal is to be predicted during
a parse, we shall look more closely at the question of deciding when ω
indicates A in q. First of all, we shall adopt the following convention:
that only lookaheads of length k will be used to indicate nonterminals in
LR(k) states. This is purely for convenience and uniformity, and does not
impose any significant restrictions on our work. (If we wanted to use a
lookahead of length k₁ with an LR(k₂) state, where k₁ > k₂, we could always
generate the corresponding LR(k₁) state; while if k₂ > k₁, we can pad out
the lookahead to make it of length k₂.)

To see if ω indicates A in q, we should determine all ways of generating
strings beginning with ω from all the essential items of q, and then ascertain
whether or not A appears on the leftmost branch of each generation tree.
When we say that we must consider all different ways of generating these
strings, we only mean those ways that differ along the leftmost branch of
their derivation trees, which is where A should appear. If two methods of
generating strings beginning with ω are the same along the leftmost branch,
but differ in the interior of the generation trees, then A appears along the
leftmost branch of both or of neither; so we need consider only one of these
two methods.

That is, suppose $C \to \alpha . B\beta\,(\tau)$ is any essential item of q and that
$B\beta\tau \Rightarrow^{*} \rho$, $\rho \in V_T^{*}$, and $\omega = \rho/k$. Then we want to show that, for every
derivation

$$B\beta\tau \Rightarrow_L A_1 x_1 \beta\tau \Rightarrow_L A_2 x_2 x_1 \beta\tau \Rightarrow_L \cdots \Rightarrow_L A_n x_n x_{n-1} \cdots x_1 \beta\tau \Rightarrow_L x_{n+1} x_n \cdots x_1 \beta\tau \Rightarrow^{*} \rho,$$

where $x_{n+1}$ either begins with a terminal symbol or is $\epsilon$, we have $A_i = A$
for some i. We know that $C \to \alpha . B\beta\,(\tau)$ is an item of q, and
that $B \to A_1 x_1$ is a rule of the grammar. Then for every $\tau_1$ such that $|\tau_1| = k$
and $\beta\tau \Rightarrow^{*} \tau_1 t$, $B \to . A_1 x_1\,(\tau_1)$ is also an item of q. In particular, we know
that $B\beta\tau \Rightarrow^{*} \rho$; if we set $\rho'$ as the part of $\rho$ which B generates, and $\rho''$ the
part generated by $\beta\tau$, we can let $\tau_1 = \rho''/k$; and then we will have that
$\omega \in \mathrm{FIRST}_k(A_1 x_1 \tau_1)$. Similarly, every item $A_1 \to . A_2 x_2\,(\tau_2)$ is an item of q, and a
descendant of $B \to . A_1 x_1\,(\tau_1)$, provided $\tau_2 \in \mathrm{FIRST}_k(x_1 \tau_1)$. In particular, since
$A_1 x_1 \tau_1 \Rightarrow^{*} \rho' \tau_1$, we can let $\tau_2$ be the first k symbols of the section of $\rho' \tau_1$
that are generated by $x_1 \tau_1$. We can proceed on in this way, and we can be
sure that there will be items of the form $A_{i-1} \to . A_i x_i\,(\tau_i)$, for each i, in q,
where $\tau_i$ is the first k symbols of $\rho$ generated by $x_{i-1} \cdots x_1 \beta\tau$; and for each of
these items, $\omega \in \mathrm{FIRST}_k(A_i x_i \tau_i)$. Finally, there will be an item $A_n \to . x_{n+1}\,(\tau_{n+1})$
in q, with $\omega \in \mathrm{FIRST}_k(x_{n+1} \tau_{n+1})$, where $x_{n+1}$ is $\epsilon$ or begins with a terminal.

Thus we see that for each different derivation

$$B \Rightarrow A_1 x_1 \Rightarrow A_2 x_2 x_1 \Rightarrow \cdots \Rightarrow A_n x_n \cdots x_1 \Rightarrow x_{n+1} x_n \cdots x_1,$$

such that $\omega \in \mathrm{FIRST}_k(x_{n+1} x_n \cdots x_1 \beta\tau)$, there is a different sequence
of items of q, $A_{i-1} \to . A_i x_i\,(\tau_i)$, such that each item in the sequence is a
descendant of the previous one, and such that $\omega \in \mathrm{FIRST}_k(A_i x_i \tau_i)$ for each i.

Thus we see that we have all the information readily available in q which we
need in order to determine if ω indicates A in q. Namely, we will have to
consider all sequences of items of q, $I_0, I_1, \ldots, I_n$, such that $I_0$ is an
essential item, $I_{i+1}$ is an immediate descendant of $I_i$, and $I_n$ is
$A_n \to . x_{n+1}\,(\tau_{n+1})$ where $x_{n+1}$ is either $\epsilon$ or begins with a terminal, and such that
$\omega \in \mathrm{FIRST}_k(x_{n+1} \tau_{n+1})$; and then, with all such sequences in hand, determine
whether or not there is an A-item in the interior of each of them. (There
may be infinitely many such sequences, but we shall get around that shortly.)

Let us observe at this point that the problem of the presence of an
A-item somewhere after the first item in a sequence is equivalent to the
presence of a .A-item anywhere at all in the sequence, since in such a
sequence, one item can be a .A-item if and only if the next item is an
A-item. We are interested in the presence of an A-item somewhere after
the essential item in each sequence we are considering; it will be more
convenient for us to phrase our condition as a .A-item anywhere in the
sequence.

Observe that this approach, in terms of these item sequences, works
even in the case where A does not generate all of ω, but only some initial
segment of it. If we were to predict an A upon seeing ω upon entry to
q, the submachine delegated to finding the A would return before all of ω
had been read onto the stack. Nonetheless, there still would be a .A-item
in each of these item sequences.

As an example, consider the LR(2) state of Figure 3.8; we wish to know
if ab indicates the presence of B in this state. In order to do this, we
must identify all those items of the state which might give rise to a lookahead
of ab upon entrance to the state; and then consider all sequences of such
items, checking whether or not there is a .B-item in each of them.

X → c.B(xy)
X → c.A(xz)
A → .bc(xz)
B → .CE(xy)
B → .DF(xy)
C → .a(bx)
C → .a(dx)
D → .aA(fg)

Figure 3.8

(We assumed, in the construction of this state, that the rules for E in the
grammar were E → b and E → d; and for F, F → fg.)

The items beginning with terminal symbols, and having ab among their
lookaheads, are C → .a(bx) and D → .aA(fg). (In the last case,
ab ∈ L₂(aA) since A → bc is a rule. The item C → .a(dx) gives rise only to
the lookahead ad.) We then consider all appropriate
sequences of items, beginning with an essential item and ending in one of
these. There are two such sequences, and they are given in Figure 3.9.

X → c.B(xy)        X → c.B(xy)
B → .CE(xy)        B → .DF(xy)
C → .a(bx)         D → .aA(fg)

Figure 3.9

We see by inspection that there is indeed a .B-item in each of these sequences,
so ab does indeed indicate B in this state. We also see that ab does not
indicate any other nonterminal in this state, since there is not a .Z-item
in each sequence for any other Z. If we examined the case for lookahead ad,
we would have just the single sequence X → c.B(xy), B → .CE(xy), C → .a(dx).
Then we see that ad indicates the presence of B in this state, but it
also indicates the presence of C. In general, a single string may indicate
the presence of several nonterminals in a given state.
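The check just carried out by hand can be mechanized. The sketch below enumerates the item sequences of the state of Figure 3.8 for a given lookahead and intersects the post-dot nonterminals found along them. It is an illustration under several assumptions: symbols are single characters (upper case for nonterminals, lower case for terminals), essential items are recognized by a dot past position 0, and the state contains no left-recursive items, so that the enumeration terminates. The names `RULES`, `STATE`, `first_k`, `sequences` and `indicated` are hypothetical.

```python
K = 2

# Grammar rules needed by the state of Figure 3.8 (E and F as assumed there).
RULES = {"A": ["bc"], "B": ["CE", "DF"], "C": ["a"], "D": ["aA"],
         "E": ["b", "d"], "F": ["fg"]}

# Items are (left side, right side, dot position, lookahead).
STATE = [
    ("X", "cB", 1, "xy"), ("X", "cA", 1, "xz"),
    ("A", "bc", 0, "xz"),
    ("B", "CE", 0, "xy"), ("B", "DF", 0, "xy"),
    ("C", "a", 0, "bx"), ("C", "a", 0, "dx"),
    ("D", "aA", 0, "fg"),
]


def first_k(symbols, k=K):
    """Terminal prefixes of length at most k derivable from `symbols`."""
    if k == 0 or not symbols:
        return {""}
    head, rest = symbols[0], symbols[1:]
    if head.islower():
        return {head + t for t in first_k(rest, k - 1)}
    result = set()
    for rhs in RULES[head]:
        result |= first_k(rhs + rest, k)
    return result


def sequences(omega):
    """Item sequences from an essential item down to an item whose post-dot
    part is empty or begins with a terminal, and which can begin omega."""
    found = []

    def extend(seq):
        _, rhs, dot, la = seq[-1]
        post = rhs[dot:]
        if not post or post[0].islower():
            if omega in first_k(post + la):
                found.append(seq)
            return
        for item in STATE:
            # Immediate descendants: dot-initial items for the post-dot
            # nonterminal whose lookahead can follow that nonterminal here.
            if (item[0] == post[0] and item[2] == 0
                    and item[3] in first_k(post[1:] + la)):
                extend(seq + [item])

    for item in STATE:
        if item[2] > 0:            # assumed test for an essential item
            extend([item])
    return found


def indicated(omega):
    """Nonterminals Z such that every sequence for omega has a .Z-item."""
    seqs = sequences(omega)
    if not seqs:
        return set()
    per_seq = [{rhs[dot] for (_, rhs, dot, _) in seq
                if dot < len(rhs) and rhs[dot].isupper()} for seq in seqs]
    return set.intersection(*per_seq)
```

On this state, `indicated("ab")` yields only B, while `indicated("ad")` yields both B and C, in agreement with the inspection above.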

Suppose then that ω indicated the presence of A in state q; how can
we use this information to make predictions and split the state q? If we
decided to split q by creating a predictive state for A, we would divide
the items of q between the new predictive state and the replacement base
state for q. Now the goal of this predictive state (and the submachine
which it heads) is to locate an A; so those items which will be used to parse
to the level of an A belong in the predictive state. If the lookahead upon
entry to q is ω, then we know that an A will eventually be found in q. Any
sequence of reductions made in q defines a sequence of items, each of which
is a descendant of the following one; in the case of lookahead ω, there will
be a .A-item in each item-sequence which q will be going through. So we can
"break" each of these sequences just below the .A-item; the items after the
.A-item will go into the predictive state, and will be used there to parse
the input up to the level of an A. After that, the base state will resume
control and use its items to reduce the input up to the appropriate essential
item. Transfer between the base and the predictive state will be made on
lookahead ω, and the return will be effected once the A has been located.

We now have to ask the crucial question: under what circumstances will
we decide to split a state to reflect the prediction of an A, and how will we
decide precisely which items are to go into the base and which into the
predictive state? The above discussion showed us how, if ω indicates an A
in q, to divide up q to handle the processing of strings beginning with ω.
But when will we avail ourselves of this technique? For the major part of
our development, we shall stipulate that q is to be split by creating a
predictive state for A if and only if the following condition holds: if ω
is the lookahead upon entrance to q when an A will eventually be found
in q, then ω must indicate A in q. That is, we will only split a state by
means of a predictive state for A if examination of the lookahead upon entry
to q will always enable us to detect the presence of an A. To put this
condition in other terms: if B → α.Aβ(τ) is an item of q, and ω ∈ Lₖ(Aβτ),
then ω must indicate A in q for us to try to split q into an A-predictive
state. It is insufficient for A to be indicated by some of the lookaheads
it generates, but not by others; the lookahead must always be able to tell
us whether or not q is going to find an A. If this is the case, then we
will create a submachine, headed by an A-predictive state, to find this A, and
devise the state-splitting of q accordingly.

For  example,  consider  again  the  state  of  Figure  3.8.  The  only  lookaheads 
that  can  exist  upon  entry  to  this  state,  if  B  is  to  be  eventually  found  there, 
are  ab  and  ad;  and  as  we  have  seen,  both  of  these  strings  indicate  the  presence 
of  B.  Therefore,  we  can  consider  splitting  this  state  in  such  a  way  that 
B  gets  predicted  if  the  base  is  entered  with  the  lookahead  being  ab  or  ad. 

But now consider the possibility of predicting a C. The same two strings,
ab and ad, are the only possible lookaheads if C is to be found; but while
ad does indicate the presence of C, ab does not, as we have seen. Thus C
is not a candidate for directing a splitting of that state. In doing
conditional predictions, we insist that the lookahead always tell us if the
nonterminal in question will be found or not, not just some of the time. Thus
we do not consider creating a predictive state for C and jumping to it from
the base when the lookahead is ad.
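The splitting condition amounts to a subset test: every lookahead that can arise when the nonterminal is eventually found must also indicate it. A minimal sketch follows, with the two tables filled in by hand from the discussion of the state of Figure 3.8; the names `GENERATED`, `INDICATES` and `may_predict` are hypothetical.

```python
# Lookaheads possible on entry to the state when the given nonterminal is
# eventually found there (values taken from the discussion in the text).
GENERATED = {"B": {"ab", "ad"}, "C": {"ab", "ad"}}

# Nonterminals indicated by each lookahead (also taken from the text).
INDICATES = {"ab": {"B"}, "ad": {"B", "C"}}


def may_predict(nonterminal):
    """True if every lookahead that can lead to `nonterminal` indicates it."""
    return all(nonterminal in INDICATES.get(w, set())
               for w in GENERATED[nonterminal])
```

With these tables, B passes the test and C fails it, because ab can precede a C without indicating one.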

Let us suppose then that this condition is met, that the lookahead always
indicates A in q, if it is there. (As is true for B in the state of Figure 3.8.)
Precisely how is q to be split? As we discussed above, the predictive state
will be transferred to from the base state on any lookahead ω which indicates
the presence of an A, and it will parse the string until it finds an A. Then
it will return control to the base state, which will finish doing the job
that q was supposed to have done. Thus for each such ω, we consider all the
item-sequences defined earlier; in each of these, by hypothesis, there will
be a .A-item. The items in the sequence from the essential item at the
beginning, down through the .A-item, will go into the base state; while the
rest of the sequence, starting with the A-item, go into the predictive state.
We create predictive transitions from the base to the predictive state on
each ω, and a return transition from the predictive state to POP on A.

But there is more that needs to go into the base state. In general
there will be lookaheads that do not occasion an A to be predicted, that will
not result in an A being found. Inputs beginning with such lookaheads must be
processed in toto by the base state, and so all items that may be used in
such processing must be left in the base state. These are all items which
may be used in processing any lookaheads other than the ω's that indicate A;
that is, items X → α.Yβ(τ), where ω′ ∈ Lₖ(Yβτ) and ω′ does not indicate A
in q.

The splitting of the state of Figure 3.8 is shown in Figure 3.10. The
base state is on the left and the predictive state on the right. The items
in the base state are those in the "upper" part of the item-sequences used to
process strings beginning with ab or ad (namely the item X → c.B(xy)), and
those used to process strings that do not begin with either of these lookaheads
(X → c.A(xz) and A → .bc(xz)). The predictive state, meanwhile, contains those
items used until the discovery of B.

Base state:        Predictive state (entered on /ab,ad):
X → c.B(xy)        B → .CE(xy)
X → c.A(xz)        B → .DF(xy)
A → .bc(xz)        C → .a(bx)
                   C → .a(dx)
                   D → .aA(fg)
                   (B-transition to POP)

Figure 3.10
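The division into base and predictive items can be sketched as follows. The item sequences of the state of Figure 3.8 are written out by hand, keyed by lookahead; the sketch assumes each sequence for a predicting lookahead contains exactly one item with the predicted nonterminal after the dot (that is, no left recursion). The names `SEQUENCES` and `split_state` are hypothetical.

```python
# Items are (left side, right side, dot position, lookahead).
SEQUENCES = {
    "ab": [[("X", "cB", 1, "xy"), ("B", "CE", 0, "xy"), ("C", "a", 0, "bx")],
           [("X", "cB", 1, "xy"), ("B", "DF", 0, "xy"), ("D", "aA", 0, "fg")]],
    "ad": [[("X", "cB", 1, "xy"), ("B", "CE", 0, "xy"), ("C", "a", 0, "dx")]],
    "bc": [[("X", "cA", 1, "xz"), ("A", "bc", 0, "xz")]],
}


def split_state(nonterminal, predicting):
    """Divide the items into (base, predictive) for predicting `nonterminal`
    on the lookaheads in `predicting`."""
    base, predictive = [], []

    def add(target, items):
        for item in items:
            if item not in target:
                target.append(item)

    for lookahead, seqs in SEQUENCES.items():
        for seq in seqs:
            if lookahead in predicting:
                # Position of the item with `nonterminal` right after the dot.
                cut = next(i for i, (_, rhs, dot, _) in enumerate(seq)
                           if rhs[dot:dot + 1] == nonterminal)
                add(base, seq[:cut + 1])        # down through the dotted item
                add(predictive, seq[cut + 1:])  # the rest of the sequence
            else:
                add(base, seq)                  # processed wholly by the base
    return base, predictive


base, predictive = split_state("B", {"ab", "ad"})
```

For the prediction of B on ab and ad, `base` holds the three items of the base state described above and `predictive` holds the remaining five.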

It appears that the items in the predictive state for A are just the
items that form the completion of A; i.e., all nonessential A-items of q
and all their descendants. This view is not entirely accurate, as we shall
see shortly. However, one feature of it is correct; namely, that the essential
items of q all remain in the base state, and none are included in the
predictive state.

In general, there may be several candidates for prediction in a given
state q, and a different state-splitting for each choice. For example,
consider the LR(2) state of Figure 3.11, which is a substate of the state
of Figure 3.8. There are several nonterminals whose prediction is legal
in this state and which can induce a state-splitting of it. For example,
B is indicated by the lookaheads ab and ad, as in the earlier state;
but on the other hand, C is also indicated by these same lookahead strings.
That is, if either of these strings is sighted on entrance to the state,
both a C and a B will be found by this state. And finally, the lookahead
bc indicates an A; and since it is the only string that can come from an A,
this means we could predict an A. The alternative state-splittings induced
by these different predictions are given in Figure 3.12.

X → c.B(xy)
X → c.A(xz)
A → .bc(xz)
B → .CE(xy)
C → .a(bx)
C → .a(dx)

Figure 3.11


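[Figure 3.12: The alternative splittings of the state of Figure 3.11, one for
each of the predictions discussed above; each predictive state leads to POP.]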


Each  one  of  these  splittings  is  valid,  and  at  this  time  we  shall  not 
attempt  to  assess  their  relative  merits.  A  machine  designer  might  choose 
any  one  of  them;  later  we  shall  describe  some  criteria  for  deciding  what  a 
valuable  or  optimal  splitting  might  be. 

Let us examine the implications of Figure 3.12. This indicates that
upon entry to q, on lookahead ab or ad, one could predict the occurrence
of B, while on lookahead bc, an A could be predicted. Since these two
"predictive languages" are disjoint, we should be able to combine these
pieces of information, and create a splitting of q with several predictive
states. Here the lookahead tells us not only whether or not to predict, but
what to predict. The predictive and base states are constructed by the same
principles as before: the item-sequences for the A lookaheads are divided
at the .A-item, one part going into the base state and the other into the
A-predictive state; and similarly for the B lookaheads. And those items
with other lookaheads automatically remain in the base state. Figure 3.13
shows a state-splitting involving predictive states for A and for B.


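Base state:
X → c.B(xy)
X → c.A(xz)

B-predictive state (entered on /ab,ad):
B → .CE(xy)
C → .a(bx)
C → .a(dx)

A-predictive state (entered on /bc):
A → .bc(xz)

(Each predictive state has a transition to POP.)

Figure 3.13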

It  would  not  be  possible  to  split  the  state  of  Figure  3.11  into  a  base 
and  predictive  states  for  B  and  C,  since  the  lookaheads  that  indicate  the 
presence  of  these  two  nonterminals  and  hence  cause  their  prediction,  overlap 
(in  fact  they  are  identical).  It  is  possible  to  have  multiple  predictive 
states,  associated  with  different  nonterminals,  if  and  only  if  the  lookaheads 
associated  with  these  various  predictions  are  mutually  disjoint. 
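The disjointness requirement is easy to state as a pairwise test. The sketch below takes a mapping from each nonterminal to the set of lookaheads on which it would be predicted; the function name `compatible` is a hypothetical choice.

```python
from itertools import combinations


def compatible(predictions):
    """True if the lookahead sets of all predictions are mutually disjoint.

    `predictions` maps each nonterminal to its set of predicting lookaheads.
    """
    return all(a.isdisjoint(b)
               for a, b in combinations(predictions.values(), 2))


a_and_b = compatible({"A": {"bc"}, "B": {"ab", "ad"}})
b_and_c = compatible({"B": {"ab", "ad"}, "C": {"ab", "ad"}})
```

For the state of Figure 3.11, the pair A and B passes this test, while the pair B and C does not, since their lookahead sets coincide.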

One other potential problem with our notion of predictive state-splitting
needs to be considered. What if A, the nonterminal that is to be predicted,
is left recursive? In that case, there may be several A's along the leftmost
branch of the tree above the lookahead. Which of these is the one that is
being predicted? How will the submachine that is supposed to find an A know
when its task is completed? Should it return as soon as it finds an A, or
keep going until it finds a higher level A? Earlier, in describing how q
was to be divided into base and predictive states, we said that each
relevant item-sequence of q was to be "broken" into two segments, the dividing
line being given by the .A-item in the sequence; but if A is left recursive,
there may be several .A-items in such a sequence. Where do we choose to break
the sequence: how much should we consign to the predictive and how much to
the base? We have said that there is to be an A-transition from the
predictive state to the POP state; but if there is some .A-item in the predictive
state, there will have to be some A-transition to a non-POP state as well.
How is the machine to know which transition to follow?

All these questions are aspects of the same problem: namely, if A is
left recursive, what does it mean to predict an A? We could give an
arbitrary answer, and say that it means to predict the first A that will be
found (the lowest in the tree). Then we would know how to break the
item-sequences (at the lowest .A-item), and there would never be any .A-items
in the predictive state. But this simple solution is much too restrictive
for a variety of reasons.

Rather than try to proscribe the kinds of predictions that may be made, we put no limits on them at all, save the restriction that the submachine must be able to determine when it has found the A that it has been dispatched to find. What information can it use to make this determination, and what test shall it perform? We stipulate that the submachine can inspect the lookahead after it has found an A, and use the nature of this lookahead to determine whether or not it has completed the job it was sent out to do. In order for this to be effective, there will have to be two classes of A's: one is the kind of A which is predicted; and the other is the kind of A which is discovered while the predicted A is being constructed. These two classes of A's must have disjoint follow sets, so that the membership of an A found by the submachine, with respect to these classes, can be determined by examination of the lookahead after the A has been found. This division of A's into two classes is not a function of the state q itself, but rather of a particular splitting of it. We are no longer giving rules as to how a splitting is to be constructed; we are specifying an additional constraint which any potential candidate for a splitting must satisfy. Given a tentative division of q into a base state and an A-predictive state, we will determine whether or not it conforms to this other requirement. For a given division of q into a base and an A-predictive state, the "predicted A's" are precisely the post-dot A's of the base state, while the "lower-level A's" are those that are in a post-dot position in the predictive state. The latter A's ought to be recognized before a transition is made back to the base state; while the last A to be found by the predictive state, the one that ought to take the machine into the POP state, will turn out, after the return is made, to be one of the post-dot A's in the base state. Thus, to determine if a particular splitting is valid, we should compute $f_1$, the set of k-length terminal strings that can follow post-dot A's in the base state; and $f_2$ similarly for the A-predictive state; and then ascertain whether or not $f_1 \cap f_2 = \emptyset$. If this is satisfied, then $A/f_1$ will label a transition from the predictive state to POP, while $A/f_2$ goes to the appropriate A-successor of the predictive state.


For example, consider the very simple LR(1) state of Figure 3.14. Since a is the only lookahead that comes from an A, and a does indicate A in this state, it is possible to split this state into a base and an A-predictive state. However, in trying to see how the items should be apportioned between these two, we discover there are several different ways of doing it.

B -> x.Ab(y)
A -> .Ab(b)
A -> .a(b)

Figure 3.14

For example, let us consider the item-sequence B -> x.Ab(y), A -> .Ab(b), A -> .Ab(b), A -> .a(b). If we chose to break this sequence after the first .A item, B -> x.Ab(y) would be in the base state, the other items in the predictive state; if instead we chose to break this sequence after the second .A item, the items B -> x.Ab(y) and A -> .Ab(b) would be in the base state, with A -> .Ab(b) and A -> .a(b) in the predictive state (there being nothing a priori wrong with having the same item in both states); while if we chose to divide this sequence after the last .A-item in it, this would consign B -> x.Ab(y) and A -> .Ab(b) to the base state, with A -> .a(b) in the predictive state. Now of course, to fully determine the constituents of the components of the splitting, we have to break each item-sequence. But it is not hard to see that we could make our breaking choices for all other sequences consistent with any one of the three we have just described for this sequence; consistent in the sense that the other breakings would not put any items into the base or the predictive, other than those assigned there by our breaking of this particular sequence. Thus we would tentatively have three different splittings of the state, each with an A-predictive state. These three alternatives are shown in Figure 3.15. However, two of these divisions violate our principle of distinguishing predicted A's from ancillary A's. Specifically, in the first two tentative splittings, b is in the 1-follow set of A's in the base state, as well as of A's in the predictive state. Hence, only the third alternative, the one without any .A items in the predictive state, is a legal splitting of the state.
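The legality test applied to these three tentative splittings can be sketched in a few lines. The sketch below is illustrative only: the tuple encoding of items and the names `follow_1` and `is_legal` are assumptions, and `follow_1` relies on the fact that, in Figure 3.14, the symbol following each post-dot A is a terminal, so that the 1-follow set can be read directly off the item.

```python
# Items are (lhs, pre_dot, post_dot, lookahead) tuples of strings.
# This encoding is an assumption made for illustration only.
B_ITEM = ("B", "x", "Ab", "y")  # B -> x.Ab(y)
REC = ("A", "", "Ab", "b")      # A -> .Ab(b)
A_ITEM = ("A", "", "a", "b")    # A -> .a(b)


def follow_1(nonterminal, items):
    """1-follow set of the post-dot occurrences of `nonterminal` in `items`.

    Assumes the symbol after the post-dot nonterminal is a terminal
    (or is supplied by the lookahead), as in Figure 3.14.
    """
    result = set()
    for _lhs, _pre, post, lookahead in items:
        if post.startswith(nonterminal):
            # Drop the nonterminal itself, then take the next symbol.
            result.add((post[1:] + lookahead)[0])
    return result


def is_legal(base, predictive, nonterminal="A"):
    """True when predicted and lower-level A's have disjoint 1-follow sets."""
    return not (follow_1(nonterminal, base) & follow_1(nonterminal, predictive))


# The three tentative splittings, as (base, predictive) pairs.
SPLITTINGS = [
    ({B_ITEM}, {REC, A_ITEM}),       # break after the first .A item
    ({B_ITEM, REC}, {REC, A_ITEM}),  # break after the second .A item
    ({B_ITEM, REC}, {A_ITEM}),       # break after the last .A item
]
VERDICTS = [is_legal(base, pred) for base, pred in SPLITTINGS]
```

Under these assumptions only the third pair passes the test, in agreement with the discussion above.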


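Splitting 1.  Base: B -> x.Ab(y)
              Predictive (entered on lookahead a): A -> .Ab(b), A -> .a(b)

Splitting 2.  Base: B -> x.Ab(y), A -> .Ab(b)
              Predictive (entered on lookahead a): A -> .Ab(b), A -> .a(b)

Splitting 3.  Base: B -> x.Ab(y), A -> .Ab(b)
              Predictive (entered on lookahead a): A -> .a(b)

Figure 3.15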

This does not mean there can be only one legal splitting of a state for a given nonterminal. The state of Figure 3.16 has two valid splittings, shown in Figure 3.17. We also show how the predictive states behave once an A has been found.

B -> x.Ab(y)
A -> .Ac(b)
A -> .Ac(c)
A -> .a(b)
A -> .a(c)

Figure 3.16

Splitting 1.  Base: B -> x.Ab(y), A -> .Ac(b), A -> .Ac(c)
              Predictive (entered on lookahead a): A -> .a(b), A -> .a(c)
              Transitions out of the predictive state: A/b to POP, A/c to POP

Splitting 2.  Base: B -> x.Ab(y)
              Predictive (entered on lookahead a): A -> .Ac(b), A -> .Ac(c), A -> .a(b), A -> .a(c)
              Transitions out of the predictive state: A/b to POP, A/c to another state of the submachine

Figure 3.17

The first one returns to the base state no matter what, while the second does so only if the lookahead is b; otherwise it transfers to some other state in the submachine. The first splitting is designed so that the submachine is to return after finding the first A it encounters; the second, on the other hand, returns only after climbing all the way up the chain of left recursive A's. Both are legal splittings, and our results will apply to both of them.

This, then, is why it is inappropriate to think of an A-predictive state of a splitting of q as containing the completion of all the A-items of q. This is true when A is not left recursive, and there is only one possible A-predictive state. But if A is left recursive, there may be several different possible A-predictive states; and the one consisting of the completion of q's A-items may be invalid for the kinds of reasons we have been discussing.

Because in general there may be several legal state-splittings of a given state q, the initial line of our formal development will not attempt to dictate how a state should be split; rather we shall concentrate on the problem of determining whether or not a proposed splitting of a state q is a well-formed splitting. But before we begin this, let us see how a split state is incorporated into the machine at large, and how the submachine to locate a predicted nonterminal is constructed.

Let us stress once again that the fact that we have been able to split a state, that we are consequently able to make a prediction during the course of a parse, should not make any material difference to the order of this parse. This is still to be performed in strict left-to-right, bottom-up order. However, we shall be performing this parse on a stack of stacks, a new level being created by the making of a prediction and suspended by its fulfillment. The machine controlling this modified parse is to be our original machine, suitably altered to include these facilities. As we mentioned earlier, the immediate changes to the design of the parsing machine are straightforward: the state q is replaced by the set of states defined by the splitting, the base and the predictive states. All transitions formerly going into q from other states of the machine are made to go into the base state; the predictive states are, of course, linked by predictive transitions to the base. Then the successors of the base and predictive states are computed recursively, until all successor states have been generated. Observe that a number of states of the original machine, which were successors of q, may no longer be accessible in the new machine, and so may be discarded; in addition, there may possibly be new states not in the original machine.

There is one special case to be considered, however. Suppose in the process of generating successors of a base or predictive state, the state q, which was the state originally split, turns up. (This might occur if q were its own successor in the original machine.) Then we do not proceed normally; the state q is not recreated, but rather any transitions which are scheduled to go to its new incarnation are instead rerouted to go to the base state of the original splitting. We shall provide an example of this, plus its motivation, a little later.

This new machine is a member of a class of machines, which we shall formally define shortly. The splitting procedure may be applied again to states of this machine, resulting in yet another machine. We shall continue this process until we achieve a parsing machine whose structure satisfies certain properties; and then we will be able to utilize our predictability in a new way. Before we go into this, however, we shall present a formal development of the preceding material. First we shall precisely define the notion of a state-splitting, and then develop the theory of multi-stack parsing, proving that a state-splitting does not affect the language recognized by a parsing machine.

3.3 Formal Definition of State-Splitting and Some Properties

First we need a number of preliminary definitions, some of which we have seen before. Throughout the following, q is an LR(k) state, I an LR(k) item, and A a nonterminal.

Definition 3.1 Suppose I is of the form $A \to \alpha.\beta(\omega)$. Then $\mathrm{FIRST}_k(I) = \{x \mid x \in V_T^k \text{ and } \beta\omega \Rightarrow^{*} xy \text{ for some } y\}$.

That is, $\mathrm{FIRST}_k$ of an item consists of all terminal prefixes of length k that can be derived from the post-dot portion of the item.

Definition 3.2 $A(q) = \{I \mid I \in q \text{ and } A \text{ is the post-dot component of } I\}$.

Definition 3.3 $L_k(q,A) = \bigcup_{I \in A(q)} \mathrm{FIRST}_k(I)$.

$L_k(q,A)$ is the set of k-lookaheads for the items of q which have A as post-dot component.
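Definitions 3.1 through 3.3 can be made concrete with a small computation. The sketch below is not part of the formal development: it uses the usual fixpoint construction of k-truncated prefixes (which also admits strings shorter than k when a derivation is shorter), the grammar is inferred from the items of Figure 3.14, and all function names and the item encoding are illustrative assumptions.

```python
def concat_k(xs, ys, k):
    """All concatenations x + y, truncated to length k."""
    return {(x + y)[:k] for x in xs for y in ys}


def first_k(symbols, grammar, table, k):
    """k-prefixes derivable from a string of grammar symbols."""
    acc = {()}
    for sym in symbols:
        part = table[sym] if sym in grammar else {(sym,)}
        acc = concat_k(acc, part, k)
    return acc


def first_k_table(grammar, k):
    """Fixpoint computation of the k-prefix sets of every nonterminal."""
    table = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_k(rhs, grammar, table, k) - table[nt]
                if new:
                    table[nt] |= new
                    changed = True
    return table


def lookaheads(state, nonterminal, grammar, table, k):
    """L_k(q, A): union of FIRST_k(I) over items with A after the dot."""
    result = set()
    for _lhs, _pre, post, lookahead in state:
        if post and post[0] == nonterminal:
            result |= first_k(post + lookahead, grammar, table, k)
    return result


# Grammar inferred from Figure 3.14 (an assumption): B -> x A b, A -> A b | a.
GRAMMAR = {"B": [("x", "A", "b")], "A": [("A", "b"), ("a",)]}
# Items are (lhs, pre_dot, post_dot, lookahead), each part a tuple of symbols.
STATE = [
    ("B", ("x",), ("A", "b"), ("y",)),
    ("A", (), ("A", "b"), ("b",)),
    ("A", (), ("a",), ("b",)),
]
TABLE = first_k_table(GRAMMAR, 1)
L_A = lookaheads(STATE, "A", GRAMMAR, TABLE, 1)
```

For this state the computation yields the single lookahead a for A, which matches the earlier remark that a is the only lookahead that comes from an A.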

Definition 3.4 A chain through q from $I_0$ to $I_n$ is a sequence of items of q, $I_0, I_1, \ldots, I_n$, such that $I_{j+1}$ is an immediate descendant of $I_j$ for each j, $0 \le j < n$.

Definition 3.5 I is a terminal item if I is of the form $A \to \alpha.a\beta(\omega)$, where $a \in V_T$, or if I is of the form $A \to .\epsilon(\omega)$.

Definition 3.6 If $L \subseteq V_T^k$, then $T(q,L) = \{I \mid I \text{ is a terminal item of } q \text{ and } \mathrm{FIRST}_k(I) \cap L \neq \emptyset\}$.

That is, a terminal item of q is in $T(q,L)$ if and only if some element of its k-lookahead set is in the set L.

Definition 3.7 Let $L \subseteq V_T^k$. A chain through q from $I_0$ to $I_n$ is an L-chain if and only if $I_0$ is an essential item of q and $I_n \in T(q,L)$.

Intuitively, the set of L-chains lists all possible ways of achieving a lookahead from the set L in state q. So should a lookahead in L be sighted upon entry to q, the sequence of reductions discovered by q must follow one of these chains.

Definition 3.8 Let R be any set of LR(k) items. Then $\mathrm{FOLLOW}_k(A,R) = \{x \mid x \in V_T^k \text{ and there is an item } B \to \alpha.A\beta(\omega) \text{ in } R \text{ such that } \beta\omega \Rightarrow^{*} xy \text{ for some } y\}$.

That is, $\mathrm{FOLLOW}_k(A,R)$ is the set of k-length terminal strings that might be a lookahead once a post-dot A has been found in R.

We are now in a position to define a legal state-splitting. For pedagogical reasons only, we shall first develop the definition for the case where only a single prediction is being made, and then immediately generalize it.

What information is needed to specify a splitting of state q? In general, we need to know what nonterminal is being predicted and upon what lookaheads, as well as the composition of the base and predictive states. We have decided, for the time being, that the lookaheads which occasion the prediction of an A must be the full set of lookaheads that may be generated from A in q, i.e., $L_k(q,A)$. Thus, only A (the predicted nonterminal), B (the set of items comprising the base state), and P (the predictive state) must be given.

So for a given triple (B, A, P), how do we know if they define a legal splitting of the state q? First of all, we must determine if the nonterminal A can really be predicted in q; and if it can be, we must ascertain that the division of q into B and P is induced by this prediction. Now if A can be predicted in state q, the language causing the prediction to be made is $L_k(q,A)$. In order to be able to safely predict an A in q upon seeing a lookahead from $L_k(q,A)$, it must be the case that every way of generating such a lookahead in q must involve an A on its leftmost branch. Since the $L_k(q,A)$-chains through q list all these leftmost branches, there must be at least one .A-item in each of these chains. So A can be predicted in q iff there is a .A-item in each $L_k(q,A)$ chain.

The actual splitting of the state, the composition of B and P, is derived directly from the positions of these A-items in the chains. The predictive state P is to contain, for each of these chains, the A-item and all items after it in the chain; while B consists of the items before the A-item, as well as items present in non-$L_k(q,A)$ chains (i.e., items contributing to lookaheads other than those that cause the prediction to be made). These are the criteria which B and P must satisfy for the splitting to be based on the prediction. If A is left recursive, there may be several A-items in any given $L_k(q,A)$ chain. In this case, we can say that the splitting is based on the prediction if every $L_k(q,A)$ chain can be "broken" at some A-item so that B and P satisfy the criteria mentioned above.

Even if a state-splitting reflects the valid prediction of a nonterminal, it may still fail to be a legal splitting, if A happens to be left recursive. For in that case, we require that a "predicted A" be distinguishable from a "lower level A", by insisting that $\mathrm{FOLLOW}_k(A,B) \cap \mathrm{FOLLOW}_k(A,P) = \emptyset$. This then is an additional requirement that must be satisfied.

We  summarize  all  this  in  the  following. 

Definition 3.9 Let q be an LR(k) state. Then a bipartite state-splitting of q is a triple (B, A, P) satisfying the following two conditions:

i) for each $L_k(q,A)$ chain $c = I_0, \ldots, I_n$ there is a j, $0 \le j < n$, such that $I_{j+1}$ is an A-item and such that if we set $H_1(c) = \{I_0, I_1, \ldots, I_j\}$ and $H_2(c) = \{I_{j+1}, \ldots, I_n\}$, then $P = \bigcup_c H_2(c)$ and $B = \bigcup_c H_1(c) + \{I \mid I \in q \text{ and } \mathrm{FIRST}_k(I) \not\subset L_k(q,A)\}$.

ii) $\mathrm{FOLLOW}_k(A,B) \cap \mathrm{FOLLOW}_k(A,P) = \emptyset$.
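Condition i) of Definition 3.9 can be read as a recipe: pick a break position in each chain and collect the two halves. The following sketch assembles B and P this way for a few chains of the state of Figure 3.14. It is illustrative only; the function name `split_from_breaks`, the item encoding, and the `residual` parameter (standing for the items whose $\mathrm{FIRST}_k$ set is not contained in $L_k(q,A)$, which the caller must supply) are assumptions.

```python
def split_from_breaks(chains, breaks, nonterminal, residual=()):
    """Assemble (B, P) from chains and one break index j per chain.

    H1(c) = items 0..j goes to the base, H2(c) = items j+1..n goes to the
    predictive state. The item just after the break must be an item for
    the predicted nonterminal (its left-hand side is `nonterminal`).
    """
    base, predictive = set(residual), set()
    for chain, j in zip(chains, breaks):
        if not 0 <= j < len(chain) - 1:
            raise ValueError("break index out of range")
        if chain[j + 1][0] != nonterminal:
            raise ValueError("item after the break is not an item for "
                             + nonterminal)
        base.update(chain[: j + 1])
        predictive.update(chain[j + 1:])
    return base, predictive


# Items are (lhs, pre_dot, post_dot, lookahead); encoding is an assumption.
B_ITEM = ("B", "x", "Ab", "y")  # B -> x.Ab(y)
REC = ("A", "", "Ab", "b")      # A -> .Ab(b)
A_ITEM = ("A", "", "a", "b")    # A -> .a(b)

# A few chains for lookahead a in the state of Figure 3.14.
CHAINS = [
    [B_ITEM, A_ITEM],
    [B_ITEM, REC, A_ITEM],
    [B_ITEM, REC, REC, A_ITEM],
]

# Break each chain after its last .A item, or after its first .A item.
BREAK_LAST = split_from_breaks(CHAINS, [0, 1, 2], "A")
BREAK_FIRST = split_from_breaks(CHAINS, [0, 0, 0], "A")
```

Breaking after the last .A item reproduces the third splitting of Figure 3.15, and breaking after the first reproduces the first one; condition ii) must still be checked separately on the resulting pair.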

This rather cumbersome definition is the formalization of all the
preceding;  and  while  it  may  seem  forbidding,  it  is  very  easy  to  use.  There 
are  several  alternative  formulations  of  the  concepts  we  are  developing  here, 
and  we  have  chosen  this  particular  approach  for  two  main  reasons:  it  mirrors 
most  naturally  the  underlying  conceptual  motivation;  and  it  is  amenable  to 
the  kinds  of  generalizations  we  will  be  making  later.  In  fact,  some  of  the 
definitions  we  are  making  and  results  we  are  proving  have  simpler  analogues; 
but  we  will  need  the  more  general  versions  later. 

As a brief digression, let us examine the meaning of this definition if k = 0. Intuitively, if an LR(0) state q can be split with a predictive state for A, then it should be possible to predict, upon entry to q, the eventual discovery of an A, regardless of what the lookahead is at time of entry. And this is precisely what the definition does imply. If k = 0, $L_k(q,A)$ is just the set consisting of the empty string $\epsilon$; therefore, every terminal item of q is in $T(q, L_k(q,A))$. Thus if there is a predictive state for A, there is an A-item in every chain through q from an essential item to a terminal one. Note further that if A is left recursive, there can be no .A items in the predictive state, and so the submachine will have to return after finding the first A that it encounters. This must be so, for if there were .A items in P, then $\mathrm{FOLLOW}_0(A,P)$ would be $\{\epsilon\}$, and so would intersect with $\mathrm{FOLLOW}_0(A,B)$; this would violate the second condition of (B, A, P) being a legal splitting.

As  promised,  we  shall  generalize  the  above  definition  to  include  the 
possibility  of  there  being  multiple  predictive  states,  with  each  distinct 
nonterminal  being  predicted  on  its  own  lookahead  set. 

Definition 3.10 Let q be an LR(k) state. Then a state-splitting of q is a pair (B,Q), where Q is a finite set of pairs $(A_i, P_i)$, satisfying:

i) $L_k(q,A_i) \cap L_k(q,A_j) = \emptyset$ if $i \neq j$.

ii) $\mathrm{FOLLOW}_k(A_i,B) \cap \mathrm{FOLLOW}_k(A_i,P_i) = \emptyset$.

iii) for each $L_k(q,A_i)$ chain $c = I_0, \ldots, I_n$, there is a j, $0 \le j < n$, such that $I_{j+1}$ is an $A_i$-item and such that if we set $H_1(c) = \{I_0, \ldots, I_j\}$ and $H_2(c) = \{I_{j+1}, \ldots, I_n\}$, then $P_i = \bigcup_c H_2(c)$, where c ranges over all $L_k(q,A_i)$ chains, and $B = \bigcup_i \bigcup_c H_1(c) + \{I \mid I \in q \text{ and } \mathrm{FIRST}_k(I) \not\subset \bigcup_i L_k(q,A_i)\}$, where for each i, c ranges over all $L_k(q,A_i)$ chains.

This  is  a  straightforward  extension  of  the  preceding  definition,  with 
the  additional  requirement  that  a  lookahead  causes  at  most  one  prediction. 

Note  that  the  base  state  is  designed  to  pick  up  after  any  one  of  the  predictive 
states  has  completed  its  task,  as  well  as  to  handle  any  string  which  does 
not  occasion  a  prediction  to  be  made. 

We will not devote much effort to studying state-splitting by itself; we are more interested in observing the effects on a parsing machine of splitting one of its states. But one thing we shall do with our definition is show it equivalent to a weaker form.

One problem with Definition 3.10 is that it is expressed in terms of all $L_k(q,A_i)$ chains through q, which in general might be an infinite set. It is not clear that this definition is effective, that there is some finite algorithm to determine whether or not these potentially infinite sets satisfy the required properties. Furthermore, this definition just tells us whether or not a candidate for a state-splitting is valid; it gives us no clue about finding a splitting for a state. We shall solve both these problems by restructuring the definition in terms of a finite, distinguished subset of the set of all $L_k(q,A_i)$ chains.

Definition 3.11 A loop in q is a chain $I_0 I_1 \ldots I_n$ such that $I_0 = I_n$ and no other pair of items are identical. A chain $K_0 K_1 \ldots K_m$ contains the loop $I_0 I_1 \ldots I_n$ if there is an i such that $K_{i+j} = I_j$ for each j, $0 \le j < n$. The chain contains r occurrences of the loop if there are r different values for such an i.

Definition 3.12 A chain is called simple if it contains no more than two occurrences of any loop. A simple L-chain is an L-chain that is also simple.
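A direct way to test simplicity is to scan the chain for loop occurrences and count them. The sketch below is illustrative rather than definitive: the names `loop_occurrences` and `is_simple` are assumptions, items are represented by arbitrary hashable labels, and an occurrence is matched as the whole loop including its closing item, which is one reading of Definition 3.11.

```python
from collections import Counter


def loop_occurrences(chain):
    """Count the occurrences of each loop contained in `chain`.

    From each start position, scan right until the start item recurs
    (a loop occurrence) or some interior item repeats (no loop can
    start here, since a loop has no other repeated pair of items).
    """
    counts = Counter()
    for start in range(len(chain)):
        seen = {chain[start]}
        for end in range(start + 1, len(chain)):
            if chain[end] == chain[start]:
                counts[tuple(chain[start:end + 1])] += 1
                break
            if chain[end] in seen:
                break
            seen.add(chain[end])
    return counts


def is_simple(chain):
    """A chain is simple if no loop occurs in it more than twice."""
    return all(count <= 2 for count in loop_occurrences(chain).values())


# "X" stands for a left recursive item such as A -> .Ab(b), which is its
# own immediate descendant; "B" and "a" stand for the end items.
THREE_X = ["B", "X", "X", "X", "a"]      # two occurrences of the loop X X
FOUR_X = ["B", "X", "X", "X", "X", "a"]  # three occurrences of the loop X X
```

With this reading, a chain that repeats a left recursive item three times is still simple, while four repetitions are one too many.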

We have defined loops in such a way that, for example, no loop can be contained in another. The set of simple chains is sufficient for consideration for purposes of state-splitting, as the following theorem shows.

Theorem 3.13 The set of state-splittings defined by Definition 3.10 is unchanged if the phrase "$L_k(q,A_i)$ chain" is changed to "simple $L_k(q,A_i)$ chain" throughout the definition.

Proof We have to show that (B,Q) satisfies the revised definition if and only if it satisfies the original one. We shall make use of the following result.

Lemma 3.14 Let c be a chain $I_0 \ldots I_n$ containing items $I_i$ and $I_j$, $i < j$. Then there is a simple chain $J_0 \ldots J_m$ such that $J_0 = I_0$, $J_m = I_n$, and for some $i'$ and $j'$, with $i' < j'$, $I_i = J_{i'}$ and $I_j = J_{j'}$.

Proof This lemma states that for any chain with distinguished items in a specified order, there is a simple chain with the same endpoints as the original chain, also containing the distinguished items in the same order. We will prove this constructively, by showing how to construct this simple chain. We shall remove all loops from the original chain, except those that contain either of the distinguished items. The result will be the desired simple chain.

So given the chain $c = I_0 \ldots I_n$, with distinguished items $I_i$ and $I_j$, $i < j$, we apply to c the following procedure.

Algorithm 3.15

1) Set $c' = c$.

2) Find the leftmost loop $I_{k_1} I_{k_2} \ldots I_{k_m}$ contained in $c'$ (i.e., the loop with the smallest value of $k_1$), which satisfies one of the following properties:

i) $k_m \le i$

ii) $k_1 \ge j$

iii) $k_1 \ge i$ and $k_m \le j$, but if $k_1 = i$ then $k_m \neq j$

If there is such a loop, go to step 3). If there isn't, go to step 4).

3) If $i = k_m$ or $j = k_m$, eliminate the subchain $I_{k_1} I_{k_2} \ldots I_{k_{m-1}}$ from $c'$; otherwise, eliminate the subchain $I_{k_2} I_{k_3} \ldots I_{k_m}$ from $c'$. Go to step 2).

4) This means we are done. Renumber $c'$ as $J_0 \ldots J_m$; this will be the desired simple chain.
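The procedure can be transcribed almost step for step. The sketch below is one possible reading, offered as an illustration: the function names and the representation of items as hashable labels are assumptions, and each entry of the working chain carries its original index so that the tests on $k_1$ and $k_m$ refer to the numbering of c, as the algorithm requires.

```python
def find_loops(entries):
    """Yield (start, end) positions of loops in `entries`, leftmost first.

    `entries` is a list of (original_index, item) pairs.
    """
    for start in range(len(entries)):
        seen = {entries[start][1]}
        for end in range(start + 1, len(entries)):
            item = entries[end][1]
            if item == entries[start][1]:
                yield start, end
                break
            if item in seen:
                break
            seen.add(item)


def simplify_chain(chain, i, j):
    """Remove loops from `chain`, keeping the items at positions i < j."""
    entries = list(enumerate(chain))  # step 1: c' = c, original numbering
    while True:
        for start, end in find_loops(entries):  # step 2: leftmost loop
            k1, km = entries[start][0], entries[end][0]
            removable = (
                km <= i
                or k1 >= j
                or (k1 >= i and km <= j and not (k1 == i and km == j))
            )
            if not removable:
                continue
            if km == i or km == j:    # step 3: keep the closing item
                del entries[start:end]
            else:                     # step 3: keep the opening item
                del entries[start + 1:end + 1]
            break
        else:
            # step 4: no removable loop is left; renumber by dropping
            # the original indices.
            return [item for _, item in entries]


# "X" stands for a left recursive item that is its own descendant.
COLLAPSED = simplify_chain(["B", "X", "X", "X", "X", "a"], 0, 5)
KEPT_MIDDLE = simplify_chain(["B", "X", "X", "X", "X", "a"], 2, 5)
UNTOUCHED = simplify_chain(["S", "X", "Y", "X", "Z"], 2, 4)
```

In the last call the loop X Y X has the distinguished item Y in its interior, so no condition of step 2) applies and the chain is returned unchanged.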

First we shall show that this algorithm does in fact produce the advertised result. Observe that by the way the loop is chosen and the subchain removal is effected, neither $I_i$ nor $I_j$ is ever removed from the chain $c'$. Furthermore, since each subchain removal eliminates all but one extremum of a loop, the resulting sequence is still a chain, and one in which $I_i$ is a predecessor of $I_j$. Finally, the endpoints of $c'$ are the same as the endpoints of c, because the endpoints of the chain could serve at worst as endpoints of eliminated loops, and so are not eliminated from the chain.

It remains to show that the final version of $c'$ is a simple chain. (The algorithm clearly does terminate since the original c was only a finite chain.) We claim that any loop $I_{k_1} \ldots I_{k_m}$ in the final version of $c'$ contains at least one of $I_i$ and $I_j$ in its interior or has them both as the endpoints. That is, either $k_1 < i < k_m$ or $k_1 < j < k_m$ or both $k_1 = i$ and $k_m = j$. This follows because any other relationship among them would satisfy one of the conditions of step 2), and thus cause the loop to have been eliminated by step 3). Now suppose that $c'$ is not simple. Then it contains more than two instances of some loop. By what we have just seen, each instance of this loop must contain $I_i$ or $I_j$ or both. Since there are more than two instances of the loop, by the pigeon-hole principle some two instances must both contain the same item $I_i$ or $I_j$, say $I_i$. Thus these two loop instances intersect on the item $I_i$. But by our definition of loop, a loop has no repetitions in its interior. Thus it is impossible for two instances of the same loop to overlap except on their endpoints, i.e., the end of one being the beginning of the other. Thus $I_i$ is the end of one loop and the beginning of the other. But if $I_i$ is the end of a loop (i.e., $i = k_m$ for that loop), the loop would have been eliminated, since it satisfies the first criterion of step 2). Thus we have reached a contradiction, and it follows that $c'$ must be simple.

Intuitively, what we have done in Algorithm 3.15 is divide the chain into three regions; see Figure 3.18. We eliminate entirely any loops contained wholly in Regions I and III. These are loops that do not involve $I_i$ or $I_j$, except possibly $I_i$ as an endpoint or $I_j$ as a starting point. Similarly, loops wholly in Region II are also eliminated, unless their two extrema are $I_i$ and $I_j$ (this qualification is to prevent either $I_i$ or $I_j$ from disappearing from the chain during a loop elimination). Thus the only loops left when the algorithm terminates are those which overlap two regions, and each of these will then contain $I_i$ or $I_j$. Note that throughout we are concerned with the particular item and location $I_i$; the same item as $I_i$ may appear elsewhere in the chain c and may be removed from its other positions by loop elimination. But the item will remain in the position held by $I_i$.

[ Region I | Region II | Region III ]

Figure 3.18

Observe that the items of $c'$ will not be numbered consecutively throughout the process of removing subchains; they retain the original numbering that they had in c, and it is only at the last step that they are renumbered.

This completes the proof and explanation of the lemma. Now we want to use this lemma to prove Theorem 3.13. We will first show that if (B,Q) satisfies the revised definition of a state-splitting, then it also satisfies the original one. Now the first two criteria of the definition are not affected by changing "$L_k(q,A_i)$ chain" to "simple $L_k(q,A_i)$ chain", since they have nothing to do with chains; so if (B,Q) satisfies them in the new definition, it satisfies them in the old definition as well. What we must show is that if each simple $L_k(q,A_i)$ chain can be broken into $H_1(c)$ and $H_2(c)$ so that $P_i = \bigcup H_2(c)$, where c ranges over all simple $L_k(q,A_i)$ chains, and the corresponding statement is true for B, then every $L_k(q,A_i)$ chain can be broken into $H_1(c)$ and $H_2(c)$, so that the same equations hold, the unions now being over all $L_k(q,A_i)$ chains.


Definition 3.16 If (B,Q) is a revised state-splitting of q, then a .$A_i$ item in B is called an $A_i$-break candidate.

Then given (B,Q), a revised state-splitting of q, and $c = I_0 \ldots I_n$, a non-simple $L_k(q,A_i)$ chain, we define $H_1(c) = \{I_0, I_1, \ldots, I_j\}$ and $H_2(c) = \{I_{j+1}, \ldots, I_n\}$, where j is the largest value for which $I_j$ is an $A_i$-break candidate. (Obviously, a revised state-splitting is one that satisfies the definition containing the phrase "simple $L_k(q,A_i)$ chains".) We are trying to use the breakings of simple chains implied by the identity of B and $P_i$ as models for breaking the non-simple chains as well.

First we must show that this definition of $H_1$ and $H_2$ is meaningful, that every non-simple $L_k(q,A_i)$ chain contains at least one $A_i$-break candidate. Starting with a non-simple $L_k(q,A_i)$ chain c, we may repeatedly remove loops until we are left with a loop-free $L_k(q,A_i)$ chain, $c' = J_0 \ldots J_m$. Since (B,Q) is a revised state-splitting, $H_1(c')$ and $H_2(c')$ are defined, because a loop-free chain is also simple. Say $H_1(c') = \{J_0, \ldots, J_l\}$ and $H_2(c') = \{J_{l+1}, \ldots, J_m\}$. By the definition, $J_{l+1}$ is an $A_i$-item which is an immediate descendant of $J_l$. Therefore $J_l$ is a .$A_i$ item. But $J_l \in H_1(c') \subseteq B$. Thus $J_l$ is an $A_i$-break candidate. Since $c'$ is contained within c, $J_l$ is an item of c, and so c has at least one $A_i$-break candidate.

Now we must show that $H_1$ and $H_2$, as they are now defined on all $L_k(q,A_i)$ chains, do indeed make (B,Q) a legal state-splitting of q according to the original definition.

We introduce the following notation: Assume q, k, and a set of $A_i$'s are given; then $R = \{I \mid I \in q \text{ and } \mathrm{FIRST}_k(I) \not\subset \bigcup_i L_k(q,A_i)\}$; while for each i, $FC_i$ is the set of all $L_k(q,A_i)$ chains and $SC_i$ is the set of all simple $L_k(q,A_i)$ chains.

What we must show then is that for the extended $H_1$ and $H_2$, and for (B,Q) as given, $P_i = \bigcup_{c \in FC_i} H_2(c)$ and $B = \bigcup_i \bigcup_{c \in FC_i} H_1(c) + R$. Since (B,Q) is known to be a revised state-splitting, we know that $P_i = \bigcup_{c \in SC_i} H_2(c)$, and similarly for B. What must be shown then is that, for each i, $\bigcup_{c \in FC_i} H_2(c) = \bigcup_{c \in SC_i} H_2(c)$ and $\bigcup_{c \in FC_i} H_1(c) = \bigcup_{c \in SC_i} H_1(c)$. We will prove the latter of these; precisely the same argument and construction suffice for the former as well.

Since $SC_i \subseteq FC_i$, we need only show $\bigcup_{c \in FC_i} H_1(c) \subseteq \bigcup_{c \in SC_i} H_1(c)$.
Let $c = I_0, \ldots, I_n$ be any element of $FC_i$, with $H_1(c) = \{I_0, \ldots, I_j\}$ and
$H_2(c) = \{I_{j+1}, \ldots, I_n\}$. Pick any element of $H_1(c)$, namely $I_k$, with
$0 \le k \le j$. We will find a simple chain $c'$, an element of $SC_i$, such that
$I_k \in H_1(c')$. We do this by applying Lemma 3.14 to the chain, using as the
distinguished items $I_k$ and $I_j$. Thus by the Lemma we have a simple chain
with the same endpoints as $c$, also containing $I_k$ as a predecessor of $I_j$.
Since the simple chain ends at a $T(q, L_k(q,A_i))$ item, it is a simple
$L_k(q,A_i)$ chain. It only remains to show that $I_k \in H_1(c')$. Now we know that
$I_j \in H_1(c)$; in fact, $I_j$ was the last $A_i$-break candidate in $c$; that is how
$H_1(c)$ and $H_2(c)$ got to be defined as they were. But if $I_j$ is an $A_i$-break
candidate, that means that $I_j \in B$, by definition of $A_i$-break candidate. So
suppose $I_k \notin H_1(c')$. Then $I_k \in H_2(c')$; and since $I_j$ follows $I_k$ in $c'$,
that means $I_j \in H_2(c')$ also. But $c'$ is a simple $L_k(q,A_i)$ chain. Therefore,
$H_2(c') \subseteq P_i$, since $(B,Q)$ is a revised state-splitting by hypothesis. Thus
$I_j \in B$ and $I_j \in P_i$. But $I_j$ is a $.A_i$ item; that is, $A_i$ is the post-dot
component of $I_j$. Therefore, $I_j$ contributes to both $\mathrm{FOLLOW}_k(A_i,B)$ and
$\mathrm{FOLLOW}_k(A_i,P_i)$; therefore, $\mathrm{FOLLOW}_k(A_i,B) \cap \mathrm{FOLLOW}_k(A_i,P_i) \neq \emptyset$. But
this contradicts the second criterion of $(B,Q)$ being a legal revised
state-splitting. Thus we have reached a contradiction. Therefore
$I_k \in H_1(c')$, and $\bigcup_{c \in FC_i} H_1(c) \subseteq \bigcup_{c \in SC_i} H_1(c)$, which is what we wanted
to prove. Precisely the same technique can be used to show
$\bigcup_{c \in FC_i} H_2(c) \subseteq \bigcup_{c \in SC_i} H_2(c)$. Thus any state-splitting satisfying the
revised definition also satisfies the original one.

To get some feeling for the kind of proof we have just used, and which
we shall use again, refer to Figure 3.19. We see how $c$ is divided into
$H_1(c)$ and $H_2(c)$, and how $c'$ might be broken if $I_k \notin H_1(c')$. Since $I_j$ is
at the end of $H_1(c)$, it must be a break-point, and so must be in $B$; while
the potential division of $c'$ would put $I_j$ in $P_i$. And this is impossible,
as we have seen.


[Figure 3.19: the chain $c$, marked at its last breakpoint, beside the chain $c'$ with the hypothetical segment $H_2(c')$]


To complete the proof of the theorem, we must assume that $(B,Q)$ is a
legal state-splitting by the original definition, and then show that it also
satisfies the revised definition. We will not have to define a new $H_1$ and
$H_2$ to use in the revised definition, but can use the restriction of the
$H_1$ and $H_2$ of the original definition to the simple $L_k(q,A_i)$ chains. Once
again, the proof reduces to showing that
$\bigcup_{c \in FC_i} H_1(c) \subseteq \bigcup_{c \in SC_i} H_1(c)$ and that $\bigcup_{c \in FC_i} H_2(c) \subseteq \bigcup_{c \in SC_i} H_2(c)$.
And to show this, we use exactly the same techniques as we did before. We
select any $L_k(q,A_i)$ chain $c$, let $I_j$ be the last item in $H_1(c)$, and let $I_k$
be any element of $H_1(c)$. Then we construct the corresponding simple chain
$c'$ containing both $I_k$ and $I_j$, and argue that if $I_k \notin H_1(c')$, then $I_j$, a
$.A_i$ item, would be in both $B$ and $P_i$, which is impossible. It is almost
identical to the first proof. Q.E.D.

This  completes  the  proof  of  the  theorem.  In  order  to  make  use  of  it, 
we  should  establish  the  following  result  which,  while  intuitively  obvious, 
deserves  proof  because  of  our  unusual  definition  of  loop. 

Proposition 3.17 For any LR(k) state $q$ and any $L \subseteq V_T^k$, there are only
finitely many simple L-chains through $q$.

Proof If there were infinitely many simple L-chains, then for any number $n$
there would be some simple L-chain with at least $n$ repetitions of the same
item. Let $m$ be the number of different loops that can be formed using items
in $q$; since there are only finitely many items in $q$, and each loop allows for
repetition only at the ends, this will be a finite number. Let $n = 3m$, and
choose some simple L-chain $c$ that has at least $3m$ instances of the same
item. Every successive pair of these items defines a subchain of the
main chain which starts and ends with the same item. Each of these subchains
must either itself be a loop or contain a loop. Thus there are at least
$3m-1$ loops in the chain $c$. Since there are only $m$ different loops that can
be formed from items of $q$, there must be some loop which has at least three
occurrences in $c$. But this means $c$ is not simple, which is a contradiction.
Therefore, there are only finitely many simple L-chains through $q$. Q.E.D.




(As a brief aside, we note that the seemingly peculiar choice of our
definition for simple chains was motivated by a desire to make Lemma 3.14
be valid. The more intuitive choices for the definition (namely a chain
with no loops, or a chain with at most one instance of any single loop)
would not satisfy this requirement. Some items of a state may not appear
in any loop-free chains through the state; and while one particular item
may precede another in an unrestricted chain, this may no longer be the case
when we consider only chains with at most one instance of any loop.)

Not only is the set of simple L-chains through $q$ a finite set, it is a
set that can be effectively computed. Recall that a simple L-chain is a
chain that ends with an item in $T(q,L)$; i.e., a terminal item $I$ such that
$\mathrm{FIRST}_k(I) \cap L \neq \emptyset$. It is well known that $\mathrm{FIRST}_k(I)$ can be computed for any
item $I$ of an LR(k) grammar; thus, since both $q$ and $L$ are finite, the elements
of $T(q,L)$ can be computed. If there are $p$ different items in the state $q$,
and $m$ is the number of different loops in $q$, then it is clear that the longest
simple chain in $q$ must be shorter than $3mp$; for if there were a simple chain
of length $3mp$ or greater, there would be at least $3m$ instances of some
particular item in the chain, which we have just seen to be impossible. Thus
it is clear that we can compute the full set of simple chains in $q$, and
hence the set of simple L-chains through $q$, for any given $L$. For any given
nonterminal $A_i$, $L_k(q,A_i)$ is also computable; therefore we can effectively
construct the set of simple $L_k(q,A_i)$ chains.
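
The enumeration argued for above can be sketched in Python. This is only an
illustrative sketch under stated assumptions, not the procedure of the text:
the state is modelled as a graph `succ` of immediate-descendant edges, the
test for simplicity is supplied by the caller as a predicate `is_simple`
(the definition of a simple chain is not restated here), and every name is
hypothetical. The one fact taken from the argument is the length bound: a
simple chain must be shorter than 3mp.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Set, Tuple

Item = Hashable
Chain = Tuple[Item, ...]


def bounded_chains(succ: Dict[Item, List[Item]],
                   starts: Iterable[Item],
                   max_len: int) -> Iterator[Chain]:
    """Yield every chain that starts at one of `starts` and has at most
    `max_len` items, following immediate-descendant edges in `succ`."""
    stack: List[Chain] = [(s,) for s in starts]
    while stack:
        chain = stack.pop()
        yield chain
        if len(chain) < max_len:
            for nxt in succ.get(chain[-1], []):
                stack.append(chain + (nxt,))


def simple_l_chains(succ: Dict[Item, List[Item]],
                    essentials: Iterable[Item],
                    terminal_l_items: Set[Item],
                    is_simple: Callable[[Chain], bool],
                    m: int,
                    p: int) -> List[Chain]:
    """Collect the chains that end in an item of T(q, L) and pass `is_simple`.

    m is the number of different loops and p the number of items in the state.
    Simple chains are shorter than 3mp; the max() only guards the degenerate
    case m = 0, where a chain cannot repeat an item (an assumption of this
    sketch).
    """
    max_len = max(3 * m * p - 1, p)
    return sorted(c for c in bounded_chains(succ, essentials, max_len)
                  if c[-1] in terminal_l_items and is_simple(c))


# Toy state: E is essential, X loops on itself, T is the only item of T(q, L).
toy_succ = {"E": ["X"], "X": ["X", "T"]}
# Stand-in predicate for illustration only: no item occurs more than twice.
toy_is_simple = lambda c: max(c.count(i) for i in c) <= 2
toy_result = simple_l_chains(toy_succ, ["E"], {"T"}, toy_is_simple, m=1, p=3)
```

Because the bound caps the chain length, the search terminates even though
the toy graph contains a loop.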

So suppose we are given an LR(k) state $q$ and $(B,Q)$, a candidate for a
splitting of $q$; we can then effectively determine whether or not $(B,Q)$
is really a splitting of $q$ as follows. Since $L_k(q,A_i)$, $\mathrm{FOLLOW}_k(A_i,B)$, and
$\mathrm{FOLLOW}_k(A_i,P_i)$ are computable for each $i$, we can check the first two
conditions of the definition. Then it only remains to see whether or not
$H_1$ and $H_2$ can be defined on the simple $L_k(q,A_i)$ chains in such a way that
the defining equations for $B$ and $P_i$ are satisfied. But there are only
finitely many $A_i$'s, and for each one, the set of simple $L_k(q,A_i)$ chains is
finite and computable; thus there are only finitely many different possible
definitions for $H_1$ and $H_2$, which we can try in turn. For each definition,
the splitting induced by it can be examined and compared with $(B,Q)$. If
$(B,Q)$ does match one of them, then it is indeed a legal splitting; otherwise
it is not.

Thus  we  have  established  the  following  result. 

Corollary  3.18  Given  an  LR(k)  state  q  and  a  pair  (B,Q),  it  is  decidable 
whether  or  not  (B,Q)  is  a  splitting  of  q. 

We can go one further step. For a given $q$, there are only a finite
number of pairs $(B,Q)$, where $B \subseteq q$ and $Q$ is a finite set of pairs $(A_i,P_i)$,
where $A_i$ is a nonterminal and $P_i \subseteq q$. All these pairs $(B,Q)$ can be
effectively generated, and each can be tested in turn as to whether or not
it is a splitting of $q$. Therefore we have:

Corollary  3.19  For  any  LR(k)  state  q,  it  is  decidable  whether  or  not  there 
is  a  splitting  of  q. 

Obviously, both these procedures are ridiculously inefficient, but they
suffice to show that state-splitting is an effective and computable notion.
Later we shall discuss more efficient algorithms for deciding these issues.
Actually, we shall rarely be interested in determining whether some arbitrary
splitting is valid or not, or whether there is some splitting of a state $q$;
rather, we shall need to know if there is some splitting of $q$ that satisfies
certain additional constraints, and these constraints will help us resolve
this question. For the time being, we shall just establish a few general
results about state-splittings, to give us some further feeling for them and
some intuition as to what might be legal or not.
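
The exhaustive search behind Corollary 3.19 can be sketched as follows. This
is a minimal illustration, not an algorithm from the text: the function
names are hypothetical, the decision procedure of Corollary 3.18 is stood in
for by a caller-supplied oracle `is_splitting`, and we assume that each
nonterminal occurs in at most one pair of $Q$.

```python
from itertools import combinations, product


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]


def candidate_pairs(items, nonterminals):
    """Generate every pair (B, Q) with B a subset of q and Q a set of pairs
    (A_i, P_i), where P_i is a subset of q.  Assumption of this sketch: a
    nonterminal appears in at most one pair of Q."""
    subs = subsets(items)
    nts = list(nonterminals)
    # Each nonterminal is either absent from Q (None) or paired with a subset.
    choices = [None] + subs
    for B in subs:
        for assignment in product(choices, repeat=len(nts)):
            Q = frozenset((A, P) for A, P in zip(nts, assignment)
                          if P is not None)
            yield B, Q


def find_splitting(items, nonterminals, is_splitting):
    """Return the first candidate accepted by the oracle, or None."""
    for B, Q in candidate_pairs(items, nonterminals):
        if is_splitting(B, Q):
            return B, Q
    return None
```

For a state of n items and t nonterminals this generates
2^n (2^n + 1)^t candidates under the stated assumption, which makes the
inefficiency of the procedure concrete.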

The  first  thing  we  should  establish  is  that  the  parts  of  a  splitting 
add  up  to  the  state  being  split.  This  will  entail  proving  a  variety  of 
interesting  intermediate  results. 

Definition 3.20 If $(B,Q)$ is a splitting of $q$, then $B$ and each $P_i$ is a
component of the splitting.


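Lemma 3.21 If the item $A \to \alpha.\beta(\omega)$ is an ancestor of the item $B \to .\gamma(\tau)$,
then $\beta\omega \Rightarrow^* \gamma\tau y$, for some $y \in V_T^*$.

Proof If $A \to \alpha.\beta(\omega)$ is an ancestor of $B \to .\gamma(\tau)$, then there is a sequence
of items $I_0, I_1, \ldots, I_n$, where $I_0 = A \to \alpha.\beta(\omega)$, $I_n = B \to .\gamma(\tau)$, and $I_{i+1}$ is
an immediate descendant of $I_i$. We proceed by induction on the length of
this sequence. If $n = 0$, the statement is trivial. So say it is true for
$n = j$; to show it true for $n = j+1$. Consider the item $I_j$ in the sequence
which shows $B \to .\gamma(\tau)$ is a descendant of $A \to \alpha.\beta(\omega)$. Since $B \to .\gamma(\tau)$ is
$I_{j+1}$, the post-dot component of $I_j$ must be $B$. Let $I_j$ be $C \to .B\gamma_1(\tau_1)$.
By induction, since $I_j$ is a descendant of $I_0$, $\beta\omega \Rightarrow^* B\gamma_1\tau_1 y$, for some $y$.
But since $I_{j+1}$ is an immediate descendant of $I_j$, we have $\tau \in \mathrm{FIRST}_k(\gamma_1\tau_1)$;
this means $\gamma_1\tau_1 \Rightarrow^* \tau y_1$, for some $y_1$. Hence
$\beta\omega \Rightarrow^* B\gamma_1\tau_1 y \Rightarrow \gamma\gamma_1\tau_1 y \Rightarrow^* \gamma\tau y_1 y$.
If we let $y' = y_1 y$, then $\beta\omega \Rightarrow^* \gamma\tau y'$, and we are done. Q.E.D.

Lemma 3.22 If item $I_1$ is an ancestor of item $I_2$, then $\mathrm{FIRST}_k(I_2) \subseteq
\mathrm{FIRST}_k(I_1)$.

Proof Suppose $I_1$ is $A \to \alpha.\beta(\omega)$ and $I_2$ is $C \to .\gamma(\tau)$. If $\rho \in \mathrm{FIRST}_k(I_2)$,
then $\gamma\tau \Rightarrow^* \rho x$ for some $x \in V_T^*$. By Lemma 3.21, $\beta\omega \Rightarrow^* \gamma\tau y$, for some $y \in V_T^*$.
Thus $\beta\omega \Rightarrow^* \gamma\tau y \Rightarrow^* \rho x y$, for some $x$ and $y$ in $V_T^*$. Therefore $\beta\omega \Rightarrow^* \rho z$, for some
$z \in V_T^*$. Since $\rho \in V_T^k$, $\rho z \in V_T^*$; and it is well known that if
$\gamma \Rightarrow^* \rho z \in V_T^*$, then $\gamma \Rightarrow_L^* \rho z$. Therefore $\beta\omega \Rightarrow_L^* \rho z$, so $\rho \in \mathrm{FIRST}_k(\beta\omega)$, so
$\rho \in \mathrm{FIRST}_k(I_1)$. Q.E.D.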

Lemma 3.23 If $\omega \in \mathrm{FIRST}_k(I)$, then there is a chain of items $I = I_0, I_1, \ldots,
I_n$, such that $I_n$ is a terminal item and $\omega \in \mathrm{FIRST}_k(I_n)$.

Proof If $I$ is a terminal item, then we are done. So assume that $I$ is
$A \to \alpha.B\beta(\tau)$; then $B\beta\tau \Rightarrow_L^* \omega x$, some $x \in V_T^*$. Then $B \Rightarrow_L^* \omega_1$, some prefix of $\omega x$.
In this derivation, there is some first application of a rule whose right-
hand side is $\epsilon$ or starts with a terminal symbol. Let this be the $n$th step
of the derivation, and let $p_i$ be the rule applied at the $i$th step, $i \le n$.
For uniformity, say $p_i$ is $B_i \to B_{i+1}\varphi_i$, while $p_n$ is $B_n \to \varphi_n$. Each $B_i \Rightarrow_L^* \omega_i$,
where $\omega_i$ is some prefix of $\omega x$; define $\rho_i$ such that $\omega_i\rho_i = \omega x$, and let
$\tau_i = \rho_i/k$. Let the item $I_i$ be $B_i \to .B_{i+1}\varphi_i(\tau_i)$.
The item $I_n$, $B_n \to .\varphi_n(\tau_n)$, is a terminal item, by definition of $n$.
Furthermore, $\varphi_n\tau_n \Rightarrow_L^* \omega_n\tau_n$, which is a prefix of $\omega x$ of length at least $k$;
hence $\omega \in \mathrm{FIRST}_k(I_n)$. All that remains is to show that $\tau_{i+1} \in \mathrm{FIRST}_k(\varphi_i\tau_i)$;
this will establish that $I_{i+1}$ is an immediate descendant of $I_i$. We know
that $B_i\rho_i \Rightarrow B_{i+1}\varphi_i\rho_i \Rightarrow_L^* \omega_{i+1}\varphi_i\rho_i \Rightarrow_L^* \omega x$; but by definition, $\omega_{i+1}\rho_{i+1} = \omega x$.
Hence $\varphi_i\rho_i \Rightarrow_L^* \rho_{i+1}$; therefore $\tau_{i+1} = \rho_{i+1}/k \in \mathrm{FIRST}_k(\varphi_i\tau_i)$. Q.E.D.

Lemma 3.24 If $I$ is any item, then $\mathrm{FIRST}_k(I) = \bigcup \mathrm{FIRST}_k(I')$, for all $I'$
which are immediate descendants of $I$, not equal to $I$.

Proof By Lemma 3.22, containment in one direction is immediate. If
$\omega \in \mathrm{FIRST}_k(I)$, then there is a chain $I = I_0, I_1, \ldots, I_n$, where $\omega \in \mathrm{FIRST}_k(I_n)$
and $I_n$ is a terminal item. Let $I_j$ be the first item in this chain which is
not equal to $I$. Then by Lemma 3.22, since $I_j$ is an ancestor of $I_n$, $\omega \in
\mathrm{FIRST}_k(I_j)$. Furthermore, $I = I_{j-1}$ by definition of $j$, so $I_j$ is an immediate
descendant of $I$. Q.E.D.


In exactly the same way, we can get the following characterizations.

Lemma 3.25 $\mathrm{FIRST}_k(I) = \bigcup \mathrm{FIRST}_k(I')$, for all $I'$ which are descendants of
$I$, not equal to $I$.

Lemma 3.26 $\mathrm{FIRST}_k(I) = \bigcup \mathrm{FIRST}_k(I')$, for all $I'$ which are terminal items
and descendants of $I$.

These results tell us that it is possible to recursively compute $\mathrm{FIRST}_k$,
by starting with $\mathrm{FIRST}_k$ of terminal items.
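
The characterization of Lemma 3.26 can be turned into a small computation.
The following Python sketch is illustrative only: it assumes that the
immediate-descendant relation is given as a dictionary `children`, that
$\mathrm{FIRST}_k$ of each terminal item is supplied in `terminal_first`, and
that an item counts as its own descendant (as in the trivial case $n = 0$ of
Lemma 3.21). All names are hypothetical.

```python
def first_k_table(children, terminal_first):
    """Compute FIRST_k for every item as the union of FIRST_k over the
    terminal items among its descendants (an item counts as its own
    descendant in this sketch).

    children:       item -> iterable of immediate descendants
    terminal_first: terminal item -> set of k-lookahead strings
    """
    def descendants(item):
        # Plain graph reachability; `seen` makes loops harmless.
        seen, stack = {item}, [item]
        while stack:
            cur = stack.pop()
            for nxt in children.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    items = set(children) | set(terminal_first)
    for kids in children.values():
        items.update(kids)

    table = {}
    for item in items:
        acc = set()
        for d in descendants(item):
            acc |= terminal_first.get(d, set())
        table[item] = acc
    return table


# Toy example with k = 1: I1 is its own immediate descendant (a loop).
toy_children = {"I0": ["I1", "I2"], "I1": ["I1", "T1"], "I2": ["T2"]}
toy_terminal_first = {"T1": {"a"}, "T2": {"b"}}
toy_table = first_k_table(toy_children, toy_terminal_first)
```

On the toy data the table also exhibits the containment of Lemma 3.22: the
set computed for a descendant is always a subset of the set computed for
its ancestor.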

Lemma 3.27 Let $(B,Q)$ be a splitting of $q$, $I$ an item of $q$. If $\mathrm{FIRST}_k(I) \cap
L_k(q,A_i) \neq \emptyset$, for some $i$, then $I$ is an element of some simple $L_k(q,A_i)$ chain.

Proof Since $I$ is an item of $q$, it is a descendant of some essential item of
$q$. Suppose $\omega \in \mathrm{FIRST}_k(I) \cap L_k(q,A_i)$; then by Lemma 3.23, $I$ is the ancestor
of some terminal item $I'$, such that $\omega \in \mathrm{FIRST}_k(I')$. Join these two chains,
the one from an essential item to $I$, and the one from $I$ to $I'$, and the
result is a chain through $q$ to $I'$, that contains $I$. By Lemma 3.14, we can
construct from this chain a simple chain through $q$ to $I'$, which also contains
$I$. Since $I'$ is an $L_k(q,A_i)$ terminal item by definition, this chain is a
simple $L_k(q,A_i)$ chain. Q.E.D.

Lemma 3.28 Let $(B,Q)$ be a splitting of $q$; then every item of $q$ appears in
at least one component of the splitting.

Proof Let $I$ be any item of $q$. If $\mathrm{FIRST}_k(I) \not\subseteq \bigcup_i L_k(q,A_i)$, then $I \in B$ by the
defining equation for $B$. If $\mathrm{FIRST}_k(I) \cap L_k(q,A_i) \neq \emptyset$ for some $i$, then by
Lemma 3.27, $I$ is contained in some simple $L_k(q,A_i)$ chain $c$; then by definition,
$H_1$ and $H_2$ are both defined on $c$, so $I \in H_1(c)$ or $I \in H_2(c)$. But $H_1(c) \subseteq B$
and $H_2(c) \subseteq P_i$; hence $I \in B$ or $I \in P_i$. Q.E.D.

Corollary 3.29 If $(B,Q)$ is a splitting of $q$, then $q$ equals the union of
all the components of the splitting.

Proof By Lemma 3.28 and the fact that each component consists of items of $q$.

Q.E.D.

Proposition 3.30 If $(B,Q)$ is a splitting of $q$, and if $I \in T(q,L_k(q,A_i))$ for
some $i$, then $\mathrm{FIRST}_k(I) \subseteq L_k(q,A_i)$.

Proof Intuitively, what this says is that if some $k$-lookahead of the item
$I$ of state $q$ triggers the prediction of an $A_i$, then every lookahead of that
item does cause a prediction of an $A_i$. The proof is as follows. Since $I \in q$,
there is a chain to $I$ from some essential item of $q$, and hence some simple
chain. Since $I \in T(q,L_k(q,A_i))$ (that is, since $\mathrm{FIRST}_k(I) \cap L_k(q,A_i) \neq \emptyset$),
by definition any simple chain ending in $I$ is an $L_k(q,A_i)$ chain. Since $(B,Q)$
is a splitting of $q$, there is a $.A_i$ item in each such chain. Hence $I$ is
the descendant of some $.A_i$ item, $I'$. Therefore $\mathrm{FIRST}_k(I) \subseteq \mathrm{FIRST}_k(I')$, by
Lemma 3.22. Since by definition, $L_k(q,A_i)$ is the union of $\mathrm{FIRST}_k$ for all $.A_i$
items, we have $\mathrm{FIRST}_k(I) \subseteq \mathrm{FIRST}_k(I') \subseteq L_k(q,A_i)$. Q.E.D.

Theorem 3.31 Let $q$ be an LR(k) state, $(B,Q)$ a splitting of $q$, and $I$ an
item of $q$ of the form $A \to \alpha.\sigma\beta(\tau)$, where $\sigma \in V_N \cup V_T$ and $\sigma$ is not a left-
recursive nonterminal. Then $I$ appears in only one component of the splitting
$(B,Q)$.

Proof Suppose $I \in P_i$ and $I \in P_j$. Then by definition, $I$ is a descendant
of some $.A_i$ item and a descendant of some $.A_j$ item. Therefore $\mathrm{FIRST}_k(I) \subseteq
L_k(q,A_i)$ and $\mathrm{FIRST}_k(I) \subseteq L_k(q,A_j)$; therefore $L_k(q,A_i) \cap L_k(q,A_j) \neq \emptyset$.
But this contradicts the definition of a state-splitting, since all the
predictive languages must be disjoint.

Now suppose $I \in B$ and $I \in P_j$. Since $I \in P_j$, $I$ is a descendant of a
$.A_j$ item, and so $\mathrm{FIRST}_k(I) \subseteq L_k(q,A_j)$. By definition, $B$ is the union of
the "upper half" of all simple $L_k(q,A_i)$-chains, and those items $I'$ such
that $\mathrm{FIRST}_k(I')$ is not contained in the union of all the $L_k(q,A_i)$. Since
$I \in B$, $I$ must fall into one of these sets. Since $\mathrm{FIRST}_k(I) \subseteq L_k(q,A_j)$
by Proposition 3.30, $I$ does not fall into the latter category. Then $I$
must be above a $.A_i$ item in some simple $L_k(q,A_i)$ chain, for some $i$. If this $i$
is not equal to $j$, then we get that $\mathrm{FIRST}_k(I) \cap L_k(q,A_i) \neq \emptyset$; but since
$\mathrm{FIRST}_k(I) \subseteq L_k(q,A_j)$, this means that $L_k(q,A_i) \cap L_k(q,A_j) \neq \emptyset$, which is a
contradiction. Thus $I$ must be above a $.A_j$ item in some simple $L_k(q,A_j)$ chain,
in order for it to be in $B$. Say the post-dot element of $I$ is $C$. Since $I$ is
above some $.A_j$ item, we have $C \Rightarrow^* A_j x$ for some $x$; since $I$ is below some $.A_j$
item, we have $A_j \Rightarrow^* C y$ for some $y$. Thus $C$, the post-dot component of $I$, is
left recursive. This contradicts our hypothesis, and so we are done.

Q.E.D.

Theorem 3.32 Let $q$ be an LR(k) state, $(B,Q)$ a splitting of $q$, and $I$ any
item of $q$. Then $I$ is in at most one predictive state of the splitting.

Proof The first half of the preceding proof works for any item at all.

Q.E.D.

Corollary 3.33 Any terminal item of $q$ appears in exactly one component
of the splitting $(B,Q)$.

This result follows either from Theorem 3.31 or from Proposition 3.30.

Theorem 3.34 Let $(B,Q)$ be a splitting of $q$. If nonterminal $A_i$ is not left
recursive, then $P_i$ is the completion of all the $A_i$-items in $q$.

Proof First of all, if $I \in P_i$, then $I$ is in $H_2(c)$ for some simple $L_k(q,A_i)$
chain; hence $I$ is a descendant of some $.A_i$ item (the one in the chain), and
thus in the completion of the $A_i$ items. On the other hand, if $I$ is an item
in this completion, then $I$ is a descendant of some $A_i$-item.
Therefore, by Lemma 3.14,
$I$ is a follower of this $A_i$-item in some simple chain through $q$. By definition,
this chain will be a simple $L_k(q,A_i)$ chain. Therefore, this chain must be
breakable at some $A_i$-item into $H_1$ and $H_2$. Since $A_i$ is not left-recursive,
there is only one $A_i$-item in this chain; since $I$ follows it in the chain, $I$
will be in $H_2$ of the chain, and therefore in $P_i$ by definition. Q.E.D.

In summary then, a state splitting distributes all the items of a
state among the components of the splitting. Any particular item occurs
in only one component unless its post-dot symbol is a left-recursive
nonterminal; in that case, the item may appear in the base state and also
in the predictive state for $A_i$, where $A_i$ and the post-dot nonterminal of
the item in question are mutually left-recursive. Similarly, the contents
of the predictive state for $A_i$ are determined to be the completion of the
$A_i$-items in the state being split, unless $A_i$ is left recursive; in this
case, the predictive state is a subset of this completion.
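
Two of the results just summarized give cheap necessary conditions that any
proposed splitting must pass: the components add up to the state (Corollary
3.29), and no item lies in two predictive states (Theorem 3.32). The Python
sketch below checks only these two conditions; passing it does not show that
a candidate is a legal splitting. The representation ($Q$ as a dictionary
from nonterminal to $P_i$) and all names are assumptions made for
illustration.

```python
def splitting_violations(q, B, Q):
    """Return a list of violated necessary conditions for (B, Q) to be a
    splitting of q.  An empty list means only that these two checks pass.

    q: set of items of the state being split
    B: set of items of the base state
    Q: dict mapping each predicted nonterminal A_i to its item set P_i
    """
    violations = []

    # Corollary 3.29: q equals the union of all the components.
    covered = set(B)
    for P in Q.values():
        covered |= set(P)
    if covered != set(q):
        violations.append("components do not add up to q")

    # Theorem 3.32: an item is in at most one predictive state.
    owner = {}
    for A, P in Q.items():
        for item in P:
            if item in owner:
                violations.append(
                    f"{item} is in the predictive states of both "
                    f"{owner[item]} and {A}")
            else:
                owner[item] = A
    return violations


good = splitting_violations({"I1", "I2", "I3"}, {"I1"},
                            {"A": {"I2"}, "C": {"I3"}})
bad = splitting_violations({"I1", "I2", "I3"}, {"I1"},
                           {"A": {"I2"}, "C": {"I2"}})
```

An item may still appear in both $B$ and some $P_i$ without being reported,
since that is permitted when its post-dot symbol is a left-recursive
nonterminal.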

3.4  MSP(k)  Machines 

As we have stated several times previously, the main line of our interest
is to investigate the effects of replacing a state in a parsing machine by
its split equivalent; that is, by a base state and some number of attached
predictive states. We have briefly described how this replacement would be
effected and in what manner the resulting machine would operate. We now
would like to study this replacement more carefully. The first step is to
develop a formal model for the kind of parsing machine that results after a
state of an LR(k) machine is replaced by its split equivalent. Since we will
be interested in possibly splitting more than one state of a machine, we will
allow for multiple split states in this new kind of machine. The operation
of this machine will be as described previously. It will operate on a stack
of stacks, reading and reducing on the topmost stack level just like an
LR(k) machine, until it encounters a predict or suspend transition, at which
point it will either create or delete the top stack level. We will call this
machine a multiple stack parsing (MSP(k)) machine; the k indicates the length
of the lookahead the machine may use during its operation.
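
The stack-of-stacks discipline just described can be pictured with a small
data structure. The Python sketch below is an assumption-laden illustration
and not the machine of the text: the class and method names are
hypothetical, states are opaque labels, and only the four stack effects
named above (read, reduce, predict, suspend) are modelled, with no
transition tables.

```python
class StackOfStacks:
    """A list of stack levels; all reading and reducing happens on the
    topmost level, as in an ordinary LR(k) machine."""

    def __init__(self, start_state):
        self.levels = [[start_state]]

    def read(self, state):
        """Push the successor state onto the topmost level."""
        self.levels[-1].append(state)

    def reduce(self, n):
        """Pop n states from the topmost level (the right-hand side of a
        rule); the caller then pushes the successor with read()."""
        if n:  # guard: del lst[-0:] would empty the whole level
            del self.levels[-1][-n:]

    def predict(self, initial_state):
        """Create a new top level, started in a predictive (initial) state."""
        self.levels.append([initial_state])

    def suspend(self):
        """Delete the top level and hand it back to the caller."""
        if len(self.levels) == 1:
            raise RuntimeError("cannot suspend the bottom stack level")
        return self.levels.pop()


demo = StackOfStacks("q0")
demo.read("q1")
demo.predict("p0")      # a prediction opens a second level
demo.read("p1")
demo.reduce(1)          # reductions touch only the top level
snapshot = [list(level) for level in demo.levels]
finished = demo.suspend()
```

What the machine does with the lower level after a suspend is left to the
caller here, since the sketch does not model the transition functions.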
An MSP(k) machine M is always associated with some grammar G, which,
for the time being, we will require to be LR(k). There are four kinds of
states in an MSP(k) machine. First of all, there are the final states:
there is exactly one final state for each rule of the grammar G, plus one
additional state POP. Then there are the initial states: these include
the starting state of the machine, as well as all predictive states of
state-splittings. Then there are the base states, which are the states
from which predictions are made. And finally, there are intermediate states,
which are regular garden-variety LR(k) states. Each state in these last
three classes is comprised of a set of LR(k)-items over G, but these states
are not always the completions of their essential items, as was the case with
LR(k) states. There are precise rules governing the composition of these
states.

There are several functions which tie the states together. The regular
transition function $f_M$ takes an initial, base, or intermediate state into a
base, intermediate, or final state, in the same way that the LR(k) transition
function operates. Namely, for a state $q_1$, a symbol $\sigma$ (terminal or nonterminal
or $\epsilon$), and a string $x$ of $k$ lookahead symbols, $f_M(q_1, \sigma, x)$ is the successor
of $q_1$ on $\sigma$ with lookahead $x$. This successor function is defined as follows.
If there is no item of $q_1$ of the form $A \to \alpha.\sigma\beta(\omega)$ with $x \in \mathrm{FIRST}_k(\beta\omega)$, then
$f_M(q_1, \sigma, x)$ is undefined. If there is an item $A \to \alpha.\sigma(x)$ in $q_1$, then
$f_M(q_1, \sigma, x)$ is the final state for the rule $A \to \alpha\sigma$. Otherwise, consider
all items of $q_1$ of the form $A \to \alpha.\sigma\beta(\omega)$, where $\beta \neq \epsilon$; let $q_2$ be the state
(stipulated to be unique) whose essential items are precisely $\{A \to \alpha\sigma.\beta(\omega)\}$. Then
$f_M(q_1, \sigma, x) = q_2$. We shall require that the states be so constructed that
the function $f_M$ is always single-valued.

(In cases where there is no ambiguity, we shall leave off the subscript
M from $f_M$ and the other functions and sets of M.)

In addition to this almost standard transition function, there is a
predictive transition function $g_1$. If $q_1$ is a base state and $x$ a string of
$k$ terminal symbols, then $g_1(q_1,x)$ is the predictive successor of $q_1$ on
lookahead $x$. If $g_1(q_1,x)$ is undefined, it means that if the lookahead is $x$
on entry to $q_1$, no prediction is to be made. If it is defined, $g_1(q_1,x)$
is the initial state to which we should jump. There is another function
associated with this prediction. If $q_2$ is an initial state, $g_2(q_2)$ is
the name of the predicted nonterminal associated with $q_2$. Thus upon entry
to state $q_1$ with the lookahead being $x$, we would jump to $g_1(q_1,x)$ and predict
$g_2(g_1(q_1,x))$. Of course, we will require that the state $q_1$ and all its
associated predictive successors define a legal state-splitting of some
LR(k) state; and that the set of $x$ such that $g_1(q_1,x)$ equals $q_2$ is precisely
$L_k$ of $g_2(q_2)$ with respect to the split state.
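
The interplay of $g_1$ and $g_2$ can be illustrated with two small lookups. The
Python sketch below is a hypothetical encoding, not part of the formal
model: $g_1$ is taken to be a dictionary keyed by (base state, lookahead)
and $g_2$ a dictionary keyed by initial state, with states and lookaheads
represented as plain strings.

```python
def predict(g1, g2, q1, x):
    """On entry to base state q1 with lookahead x, return the pair
    (initial state to jump to, predicted nonterminal), or None if g1 is
    undefined there and no prediction is to be made."""
    q2 = g1.get((q1, x))
    if q2 is None:
        return None
    return q2, g2[q2]


def predictive_language(g1, q1, q2):
    """The set of lookaheads x with g1(q1, x) = q2."""
    return {x for (q, x), target in g1.items() if q == q1 and target == q2}


# Toy tables: base state "b" predicts A on lookaheads a, c and C on d.
toy_g1 = {("b", "a"): "pA", ("b", "c"): "pA", ("b", "d"): "pC"}
toy_g2 = {"pA": "A", "pC": "C"}
```

With the toy tables, `predictive_language(toy_g1, "b", "pA")` recovers the
set of lookaheads that the definition requires to coincide with the
predictive language of `A` in the split state.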

All  of  this  is  summarised  as  follows. 

Definition 3.35 Let G be an LR(k) grammar $(V_N, V_T, S, P)$. Then an MSP(k)
machine associated with G is a tuple $(Q_1, Q_2, Q_3, Q_4, q_0, f, g_1, g_2)$, where:

1) each of $Q_1$, $Q_2$, and $Q_3$ are finite state sets, each of whose
elements is a set of LR(k)-items over G

2) $Q_4 = P \cup \{\mathrm{POP}\}$, the set of final states

3) $q_0 \in Q_1$, the starting state

4) $f: (Q_1 \cup Q_2 \cup Q_3) \times (V_N \cup V_T \cup \{\epsilon\}) \times V_T^k \to Q_2 \cup Q_3 \cup Q_4$

5) $g_1: Q_3 \times V_T^k \to Q_1$, the predictive transition

6) $g_2: Q_1 \to V_N$

and which satisfy the following properties:

7) no two states in $Q_2 \cup Q_3$ have precisely the same essential items

8) every element of $Q_2$ is the completion of its essential items

9) if $\sigma \in V_T$, then if $A \to .\epsilon(\omega)$ and $B \to \beta_1.\sigma\beta_2(\tau)$ are both items
of the same state, then $\omega \notin \mathrm{FIRST}_k(\sigma\beta_2\tau)$; if $\sigma \in V_N \cup V_T$, then
if $A \to \alpha.\sigma(\omega)$ and $B \to \beta_1.\sigma\beta_2(\tau)$ are both items of the same state,
then $\omega \notin \mathrm{FIRST}_k(\beta_2\tau)$

10) if $A \to \alpha.\sigma(\omega)$ is an item of $q$, then $f(q, \sigma, \omega)$ is the member of
$Q_4$ associated with $A \to \alpha\sigma$; if $A \to .\epsilon(\omega)$ is an item of $q$, then
$f(q, \epsilon, \omega)$ is the member of $Q_4$ associated with $A \to \epsilon$

11) consider the set of items of $q$ of the form $B \to \beta_1.\sigma\beta_2(\tau)$, where
$\beta_2 \neq \epsilon$; then there is a state $q'$ whose essential items are the items
$B \to \beta_1\sigma.\beta_2(\tau)$; and $f(q, \sigma, \omega) = q'$ for each $\omega \in \mathrm{FIRST}_k(\beta_2\tau)$, for
some item $B \to \beta_1.\sigma\beta_2(\tau)$ in $q$

12) $g_1(q,x_1) = g_1(q',x_2) \Rightarrow q = q'$; that is, each element of $Q_1$ is the
predictive image of at most one base state

13) for any $q \in Q_3$, let $q_1, q_2, \ldots, q_n$ be the different images of $q$
under $g_1$; and let $q' = q \cup q_1 \cup \ldots \cup q_n$; then $q'$ is an LR(k) state,
the completion of the essential items of $q$, and $(q, \{(g_2(q_1),q_1),
(g_2(q_2),q_2), \ldots, (g_2(q_n),q_n)\})$ is a splitting of $q'$; furthermore, for
any $i$, $1 \le i \le n$, $\{x \mid g_1(q,x) = q_i\} = L_k(q', g_2(q_i))$, called the
predictive language of $g_2(q_i)$

14) $g_2(q_0) = S$ and $q_0$ is the completion of $S$ in the context of $\dashv^k$

15) $f(q, \sigma, \omega) = \mathrm{POP}$ if $q \in Q_1$, $\sigma = g_2(q)$, and $\omega \in \mathrm{FOLLOW}_k(q', \sigma)$,
where $q'$ is the state such that $g_1(q', x) = q$ for some $x$; in addition,
$f(q_0, S, \dashv^k) = \mathrm{POP}$

16) $f(q, \sigma, x)$ or $g_1(q,x)$ is undefined unless its value is given by
one of the above clauses
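
To make the shape of this tuple concrete, the following Python sketch
records part of it as a data structure and checks two of the simpler
properties, 7 and 12. It is a partial, hypothetical encoding: items are
modelled as tuples (left-hand side, symbols before the dot, symbols after
the dot, lookahead), an item is assumed to be essential when something
precedes the dot, and $f$, $Q_4$, and the remaining properties are not
modelled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple

# (lhs, symbols before the dot, symbols after the dot, lookahead)
Item = Tuple[str, Tuple[str, ...], Tuple[str, ...], str]
State = FrozenSet[Item]


@dataclass
class MSPMachine:
    initial: Set[State]        # Q1
    intermediate: Set[State]   # Q2
    base: Set[State]           # Q3
    g1: Dict[Tuple[State, str], State] = field(default_factory=dict)
    g2: Dict[State, str] = field(default_factory=dict)


def essential_items(state: State) -> FrozenSet[Item]:
    """Assumption of this sketch: essential means a non-empty pre-dot part."""
    return frozenset(item for item in state if item[1])


def satisfies_property_7(m: MSPMachine) -> bool:
    """No two states in Q2 u Q3 have precisely the same essential items."""
    seen = set()
    for state in m.intermediate | m.base:
        key = essential_items(state)
        if key in seen:
            return False
        seen.add(key)
    return True


def satisfies_property_12(m: MSPMachine) -> bool:
    """Each initial state is the predictive image of at most one base state."""
    owner: Dict[State, State] = {}
    for (base_state, _lookahead), initial_state in m.g1.items():
        if owner.setdefault(initial_state, base_state) != base_state:
            return False
    return True


# Toy items over a made-up grammar fragment; "$" stands in for the endmarker.
i1: Item = ("S", ("A",), ("x",), "$")
i2: Item = ("A", (), ("a",), "x")
i3: Item = ("S", ("A", "x"), (), "$")
b1: State = frozenset({i1})
b2: State = frozenset({i3})
p: State = frozenset({i2})

ok = MSPMachine(initial={p}, intermediate={b2}, base={b1},
                g1={(b1, "a"): p}, g2={p: "A"})
clash = MSPMachine(initial={p}, intermediate={frozenset({i1, i2})}, base={b1},
                   g1={(b1, "a"): p, (b2, "c"): p}, g2={p: "A"})
```

The machine `clash` is deliberately ill-formed: its intermediate state and
its base state share the same essential items, and its initial state is the
image of two different states under `g1`.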

Admittedly this is a very complex definition, but we have made no
claims about the simplicity of this machine model. All we are really after
is the formalization of what the result is of performing several state-
splittings on an LR(k) machine. Thus we have designed MSP(k) machines so
that they hang together in just the right way. In an MSP(k) machine,
$Q_1$ is the set of initial states, $Q_2$ the intermediates, $Q_3$ the base states,
and $Q_4$ the final states. Intuitively, an LR(k) machine has one initial
state, the set of final states, and all the rest intermediates. Each time
a splitting is performed, a base and several initials replace an intermediate.
Note however that the essentials of the new base are precisely those of the
replaced intermediate, and that in some sense the vanished intermediate could
be reconstructed by combining together the base and all the predictives.
Intuitively, we have formulated the definition of an MSP(k) machine so that
it will be satisfied by the result of performing arbitrarily many such
replacements of an intermediate by a splitting, after each replacement
recomputing successors of the affected states.

Looking at it this way, the starting state is associated with the
nonterminal $S$, and is precisely the starting state of an LR(k) machine for
G. By conditions 9, 10, and 11, the successors of $q_0$ are precisely the same
successors that it would have in an LR(k) machine. This is true until we
reach a successor that has been "exploded", that has been replaced by a
splitting. Condition 13 requires that if some state is not a full-fledged
intermediate state, but is just a base state with the same essentials as
the intermediate that "should" be there, then the base and its associated
initials do indeed define a legal splitting of the missing intermediate.
The missing intermediate is, of course, the completion of its essential
items, and this is what the base and initials must add up to. The process
of computing successors of the base state and its predictive states continues
on in the same way.

The other properties specify various constraints to keep the model
accurate. Every initial state is associated with some nonterminal, and is
attached to at most one base state. The set of strings which cause transfer
to be made from a base to one of its attached initials and which cause
prediction of the associated nonterminal to be made, must be precisely the
lookaheads generated by that nonterminal in the state which was replaced
by the splitting. Furthermore, transition is made to POP only from an initial
state and only on the nonterminal associated with the initial state and only when
the lookahead is in the follow set of the nonterminal with respect to the
associated base state (this latter implying that the prediction has been completed).

Finally, the reason that no two states may have the same essential items
is that if two LR(k) states have the same essential items, then they are
the same state. Thus there can be only one intermediate state with a given
set of essential items. Now state-splitting leaves all the essential items
in the base state; thus splitting a state can never create a base state with
the same essential items as some other state. So if two states do have the
same set of essentials, they both must be derived from the same intermediate.
But either an intermediate state is replaced by a state-splitting, or it is
not; and if it is, it is replaced by only one base state with attached initial
states, not by several bases. Thus merely by performing several successive
state-splittings on an LR(k) machine, it is impossible to construct two
states with the same essential items; therefore, we prohibit such an
eventuality in our model of MSP(k) machines.

For an example of an MSP(1) machine, let G be the LR(1) grammar S -> Ax,
A -> aB, B -> BA, B -> b. The LR(1) machine for G is shown in Figure 3.20.
Recall our conventions for representing LR(k) machines: final states are
represented by circles containing the name of the associated rule, while
other states have the component items inside a box. All input strings are
assumed to be padded with ⊣'s, so ⊣ is the context of all S-items.
Permissible lookaheads for a transition are written after a slash.


Figure  3.20 


Figure 3.21 illustrates one possible MSP(1) machine for the grammar G.
There are three initial states, each with its associated nonterminal written
in its upper left-hand corner. Two of the initial states are attached
to base states by predictive transitions; a predictive transition is
represented by a dotted line, and the lookaheads which cause it to be
followed are written next to it after a slash. Even though POP appears
three times in the figure, there is only one POP state; it is drawn three
times only to facilitate representation of the machine. We shall continue
to use the conventions employed here in our further representations of
MSP(k) machines.


[Figure 3.21: MSP(1) Machine for G]


For the sake of completeness and precision, we will cast this example
in the formal terms of the definition. For convenience only, we have
numbered the states as indicated in the figure. The initial state set, Q1,
is {1,6,7}; Q2, the set of intermediate states, equals {2}; Q3, the base
states, is {5,8}; and Q4, the set of final states, is {3,4,9,10,POP}. The
starting state is state 1. The values for the function f are given by the
following table:

    f(1,a,b) = 5        f(6,a,b) = 5
    f(1,A,x) = 2        f(6,A,x) = POP
    f(2,x,⊣) = 3        f(7,B,a) = 8
                        f(7,B,x) = POP
    f(5,B,x) = 4        f(7,b,x) = 10
    f(8,A,x) = 9

For all other triples (q, σ, τ), f(q, σ, τ) is undefined.

Similarly, g1 is given by g1(5,b) = 7, g1(8,a) = 6. We also have
g2(1) = S, g2(6) = A, and g2(7) = B.
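As an aid to checking the example, the tables above can be written down as
plain dictionaries. What follows is only an illustrative sketch: the Python
names (F, G1, G2, attached_bases) and the use of "$" in place of the
end-marker are assumptions introduced here, not notation from the text. The
sketch confirms that g1 maps base states to initial states, and that every
initial state has an associated nonterminal and is attached to at most one
base state.

```python
from collections import defaultdict

END = "$"  # assumed stand-in for the end-marker

Q1, Q2, Q3 = {1, 6, 7}, {2}, {5, 8}          # initial, intermediate, base
Q4 = {3, 4, 9, 10, "POP"}                    # final states

# f(state, symbol, lookahead) -> state, exactly the entries listed above
F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (2, "x", END): 3,
    (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}              # g1(base, lookahead) -> initial
G2 = {1: "S", 6: "A", 7: "B"}                # g2(initial) -> nonterminal


def attached_bases(g1):
    """Map each initial state to the set of base states it is attached to."""
    bases = defaultdict(set)
    for (base, _lookahead), initial in g1.items():
        bases[initial].add(base)
    return dict(bases)


bases = attached_bases(G1)
g1_well_typed = all(b in Q3 and i in Q1 for (b, _), i in G1.items())
initials_ok = all(q in G2 and len(bases.get(q, ())) <= 1 for q in Q1)
```

Here state 1, the starting state, is attached to no base state, which the
check permits.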

It is not hard to see that this machine does indeed satisfy all the
requirements of the definition. By inspection we see that no two states
have the same essential items. State 2 is the only member of Q2, and it has
precisely one essential item, which is its own completion. Properties 9, 10,
and 11 are seen to be true by examining the contents of the states and the
nature of the f-values given above. It is also clear that each initial
state is the image under g1 of at most one base state: state 7 is the
g1-image of state 5, state 6 of state 8, and state 1 of no state. Finally,
it is easy enough to establish that states 5 and 7 do indeed define a legal
state-splitting; that the set of those τ for which g1(5,τ) = 7 equals
precisely L_1(g2(7)) = L_1(B); and that f(7,B,τ) = POP for the appropriate τ
(namely, those in FOLLOW_1(5,B)). Analogous statements can be verified for
states 8 and 6.

Intuitively, it is possible to think of this MSP(1) machine as being the
result of splitting two of the states of the LR(1) machine of Figure 3.20.


Before we proceed to describe the operation of an MSP(k) machine, we
refine the model a little further. We shall only be interested in dealing
with MSP(k) machines that have no useless states, that is, no states that
cannot be reached from the starting state.

Definition 3.36 Let M be an MSP(k) machine, q and q' states of M. Then
q' is immediately accessible from q if q' = f(q,σ,τ) for some
σ ∈ V_N ∪ V_T ∪ {ε} and τ ∈ V_T^k, or if q' = g1(q,τ) for some τ ∈ V_T^k.
The state q' is accessible from q if there is a sequence of states of M,
q = q1, q2, ..., qn = q', such that q(i+1) is immediately accessible from
q(i). A state q of M is accessible if it is accessible from q0, the starting
state of M.

Definition  3.37  An  MSP(k)  machine  is  reduced  if  and  only  if  every  one  of  its 
states  is  accessible. 
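Accessibility in this sense is ordinary graph reachability over the
transitions given by f and g1. The sketch below is a hedged illustration
only; the function names and the dictionary encoding of the example machine
of Figure 3.21 (with "$" standing for the end-marker) are assumptions made
for the illustration.

```python
from collections import deque

END = "$"  # assumed stand-in for the end-marker

# Example machine of Figure 3.21, entries as listed in the text.
F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (2, "x", END): 3,
    (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}
STATES = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, "POP"}


def immediately_accessible(q, f, g1):
    """States q' with q' = f(q, s, t) or q' = g1(q, t) for some s, t."""
    out = {t for (p, _sym, _look), t in f.items() if p == q}
    out |= {t for (p, _look), t in g1.items() if p == q}
    return out


def accessible(start, f, g1):
    """All states accessible from start (start itself included)."""
    seen, todo = {start}, deque([start])
    while todo:
        q = todo.popleft()
        for t in immediately_accessible(q, f, g1) - seen:
            seen.add(t)
            todo.append(t)
    return seen


def is_reduced(states, start, f, g1):
    """Definition 3.37: every state is accessible from the starting state."""
    return set(states) <= accessible(start, f, g1)
```

On the example, every state is accessible from state 1; dropping the
predictive transition g1(8,a) = 6 would leave state 6 inaccessible.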

We restrict our attention from here on to reduced MSP(k) machines;
"MSP(k) machine" will mean "reduced MSP(k) machine".

We note that in a reduced MSP(k) machine, every initial state other than
the starting state is guaranteed to be attached to exactly one base state.
By definition, it is attached to at most one; while if it were attached to
none, it would not be accessible from the starting state, and hence would
not be in a reduced machine. The starting state may or may not be attached
to a base state.

In  order  to  describe  the  operation  of  an  MSP(k)  machine,  we  establish 
the  following  result. 

Lemma 3.38 Let M be an MSP(k) machine for G, q a state of M. Then:

i) g1(q,x) is single-valued, for any x ∈ V_T^k;

ii) f(q,σ,x) is single-valued, for any σ ∈ V_N ∪ V_T ∪ {ε} and x ∈ V_T^k;

iii) at most one of f(q, ε, σx/k), f(q, σ, x), and g1(q, σx/k) is
defined, for any σ ∈ V_T and x ∈ V_T^k.

Proof

i) If g1(q,x) is defined, then q is a base state and g1(q,x) is one
of its associated initial states. By definition, q and its associated initial
states form a valid splitting of some state. Suppose that g1(q,x) = q1
and g1(q,x) = q2 for distinct q1 and q2; then x ∈ L_k(q',A1) ∩ L_k(q',A2),
where Ai = g2(qi) and q' is the state of whose splitting q is the base. But
by definition of state-splitting, this is an illegal state of affairs.

ii) Suppose f(q, σ, x) had two values, for some σ and x. Call these two
values q1 and q2. If both q1 and q2 are final states corresponding to rules
of G, these two rules must be A1 -> α1σ and A2 -> α2σ. Then there must be the
following items in q: A1 -> α1.σ(x) and A2 -> α2.σ(x). But this is impossible
by condition 9 of the definition.

By condition 11, there is at most one non-final state which is a
σ-successor of q, so q1 and q2 cannot both be non-final. If q1 is final,
associated with A -> ασ, and q2 is non-final, then there must be items
A -> α.σ(x) and B -> β1.σβ2(u), with x ∈ FIRST_k(β2 u), in q, which is again
impossible by condition 9.

So the only other possibility is that q1 is POP and that q2 is some
other state. We know that f(q0, S, ⊣^k) = POP; if f(q0, S, ⊣^k) equals
some other state as well, there must be an item A -> .S(⊣^k) in q0, where
S =>* A; thus we would have S =>+ S, which is impossible in an unambiguous
grammar (which every LR(k) grammar is). The only other possibility is that
f(q,σ,x) = POP, which holds iff σ = g2(q) and x is in the k-follow set of σ
in the base state which q is associated with. But if f(q, σ, x) also equals
some other state, there is some item in q, B -> β1.σβ2(w), with
x ∈ FIRST_k(β2 w). Thus x ∈ FOLLOW_k(σ, q) and x ∈ FOLLOW_k(σ, q'), where q'
is the associated base. But since q and q', together with any other
associated initial states, form a state-splitting, this is impossible, by
the definition of state-splitting.


iii) The fact that f(q, σ, x) and f(q, ε, σx/k) cannot both be
defined follows directly from condition 9 of the definition of MSP(k)
machines. Suppose then that f(q, σ, x) and g1(q, σx/k) were both defined
for some σ and x. Then q must be an element of Q3 for g1 to be defined.
Then g1(q, σx/k) is a state in Q1, say q'; let A = g2(g1(q, σx/k)), the
nonterminal associated with q'. Since g1(q, σx/k) is defined, by condition
13, σx/k ∈ L_k(q", A), where q" is the state created by combining q and all
its associated initial states (all initial states qi such that qi = g1(q, t)
for some t). Since q and these associated initial states define a splitting
of q", any item in T(q", L_k(q", A)) will be in q' and not in q (since q' is
the initial state associated with A); this is by Corollary 3.33. But if
f(q, σ, x) is defined, there must be an item B -> β1.σβ2(w), with
x ∈ FIRST_k(β2 w), in the state q. But since σx/k ∈ L_k(q", A), the item
B -> β1.σβ2(w) is in T(q", L_k(q", A)), and so B -> β1.σβ2(w) cannot be in
q, the base state of the splitting of q". We thus have a contradiction, and
our contention is proved.

The same technique shows that f(q, ε, σx/k) and g1(q, σx/k) cannot
both be defined, and we are done. Q.E.D.
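For a machine given as explicit tables, part (iii) of the lemma can be
checked mechanically. The following sketch does so for k = 1, where the
lookahead σx/k is simply σ. It is an illustration under stated assumptions:
the dictionary encoding, the function name conflicts, the empty string for
ε, and "$" for the end-marker are all introduced here and are not part of
the text. Parts (i) and (ii) hold automatically in this encoding, since a
dictionary is single-valued.

```python
END = "$"   # assumed stand-in for the end-marker
EPS = ""    # assumed encoding of the empty string

F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (2, "x", END): 3,
    (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}
STATES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, "POP"]
TERMINALS = {"a", "b", "x"}
LOOKAHEADS = TERMINALS | {END}


def conflicts(f, g1, states, terminals, lookaheads):
    """Triples (q, s, x) at which more than one of f(q, s, x),
    f(q, eps, s) and g1(q, s) is defined (the case k = 1)."""
    bad = []
    for q in states:
        for s in terminals:
            for x in lookaheads:
                defined = [(q, s, x) in f, (q, EPS, s) in f, (q, s) in g1]
                if sum(defined) > 1:
                    bad.append((q, s, x))
    return bad
```

The example machine produces no conflicts. Adding a hypothetical read
transition on b to base state 5, which already predicts on lookahead b,
would be reported.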

We  are  now  in  a  position  to  describe  the  operation  of  an  MSP(k)  machine. 


Definition 3.39 Let M be an MSP(k) machine for G. A stack level is an element
of V_N · Q1 · (V · (Q2 ∪ Q3))* · V_N; that is, a string of the form
A q0 x1 q1 x2 q2 ... xn qn B, where q0 is an initial state such that
A = g2(q0), each xi is in V_N ∪ V_T, and n ≥ 0. A topmost stack level is an
element of V_N · Q1 · (V · (Q2 ∪ Q3))* · ({ε} ∪ V · (Q2 ∪ Q3 ∪ Q4)); that
is, a string of the form A q0 x1 q1 x2 q2 ... xn qn, where n ≥ 0, each xi
is in V_N ∪ V_T, and qn may possibly be a final state. A stack is a sequence
of the form ℓ1 Δ ℓ2 Δ ... Δ ℓm Δ t, where m ≥ 0, each ℓi is a stack level,
Δ is a new marking symbol, and t is a topmost stack level.


Definition 3.40 A configuration of the MSP(k) machine M is a triple
(q, α, ω), where q is a state of M, α is a stack of M, and ω ∈ V_T* ⊣^k.

These definitions reflect the operation of the machine as we have
discussed it previously. In a configuration (q, α, ω), q represents the state
that the machine M is in, α denotes the contents of M's stack, and ω is the
remainder of the input string that has not been read yet. (Observe that the
input to an MSP(k) machine is always padded with k end-markers.) In practice,
the state q will also be the last element on the topmost stack level of α.

The organization of the stack accurately mirrors the meanings of the stack
levels. Processing proceeds on the topmost level, which resembles an LR(k)
stack, until it is time to create a new level. At that time, the name of
the predicted nonterminal is written both on the old level (the last symbol
on a stack level) and on the new one (the first symbol on a level, topmost
or not). Also the name of the initial state jumped to is written on the
new level, this being the element of Q1 that begins any level. Note that
only the last state of the topmost level can be final.

A move by M is represented by the relation ⊢ on configurations.

Definition 3.41 Let M be an MSP(k) machine for G. The relation ⊢ on
configurations of M is defined by:

1) (q, α, σω) ⊢ (q1, α σ q1, ω) if f(q, σ, ω/k) is defined,
where σ ∈ V_T ∪ {ε} and q1 = f(q, σ, ω/k)

2) (q, α, ω) ⊢ (q1, α A Δ A q1, ω) if g1(q, ω/k) is defined,
where q1 = g1(q, ω/k) and A = g2(q1)

3) (q, α q1 τ1 q2 τ2 q3 ... τm q, ω) ⊢ (q', α q1 A q', ω) if q is the
final state for the rule A -> τ1 τ2 ... τm, where q' = f(q1, A, ω/k)

4) (q, α q1 A Δ A q2 A q, ω) ⊢ (q3, α q1 A q3, ω) if q is the POP
state, where A = g2(q2) and q3 = f(q1, A, ω/k)

This is just a formalization of the machine operation that we have been
describing all along. The first case describes the effect of reading the
symbol σ onto the topmost level of the stack. Both the symbol σ and the new
state q1 are written on the stack, and σ is removed from the input. In this
case, the new configuration is called a read-successor of the old
configuration.

The second case describes how the configuration reflects the making of
a prediction. If the machine was in a base state with the lookahead indicating
that a prediction should be made (that is, if g1(q, ω/k) is defined), then
the name of the predicted nonterminal is written on the former topmost stack
level and a new topmost level is created; on it too is written the name of
the predicted nonterminal as well as the appropriate initial state with
which that nonterminal is associated. In this case, the new configuration
is a prediction successor of the old one.

The third case shows how a reduction is made on the topmost level.
If the state of the old configuration is the final state for
A -> τ1 τ2 ... τm, then τ1 ... τm is wiped off the topmost level as well as
the interleaved state names, while A and the new state name are written on
the stack. The new state is computed using A, the old state uncovered by
popping off τ1 ... τm, and the lookahead ω/k. Here the new configuration is
a reduction successor of the old one.

Finally, the last case shows how a prediction is fulfilled and the
topmost stack level eliminated. If the state of the configuration is the POP
state, then the topmost stack level is just dispensed with; the new state
is computed by dropping back to the next-to-top level, and observing the
state from which the prediction of the A was made. The lookahead used
is of course ω/k, the lookahead at the time the A was found and POP entered.
We call the new configuration here a suspension successor of the old one.

Let us note that in each of these cases, the new configuration is
indeed a well-defined configuration. In particular, the stack component
is a properly constructed stack, with the topmost stack level ending with
a state as it should.

Theorem 3.42 If M is an MSP(k) machine for the grammar G, then M is
deterministic. That is, for any configuration (q, α, ω) there is at most
one configuration (q', α', ω') such that (q, α, ω) ⊢ (q', α', ω').

Proof The state q is either a final state for a production, the POP state,
or something else. If it is a final state, only the third clause of the
definition of ⊢ can hold; since f is single-valued, this means (q, α, ω)
will have at most one successor. If q is POP, only the fourth clause can
be relevant, and again there can be at most one successor. So assume q is
non-final. Then if (q, α, ω) ⊢ (q', α', ω'), the latter must be either
a read or prediction successor of the former. Let ω = σρ, where σ ∈ V_T.
By Lemma 3.38, g1(q, ω/k) is not defined if either f(q, σ, ρ/k) or
f(q, ε, ω/k) is. Hence (q, α, ω) cannot have both read and prediction
successors. Now if (q, α, ω) does have a prediction successor, it must be
unique by the single-valuedness of g1. On the other hand, f(q, σ, ρ/k)
and f(q, ε, ω/k) cannot both be defined, again by Lemma 3.38; therefore by
the single-valuedness of f, (q, α, ω) has at most one read successor, and we
are done. Q.E.D.

We will let ⊢* denote the reflexive, transitive closure of ⊢.

Definition 3.43 An initial configuration of M is one of the form
(q0, S q0, ω ⊣^k); that is, one where the state is the starting state of M
and the stack has one level consisting of S and q0. A string ω in V_T* is
accepted by M if (q0, S q0, ω ⊣^k) ⊢* (q, S q0 S q, ⊣^k), where q is the
POP state. That is, if starting M off with ω as the input, M eventually
reads all of ω and ends up with a one-level stack which shows that S has
been found. The language of M, L(M), is {ω | ω is accepted by M}. If
ω ∈ L(M), then the sequence of configurations c0, c1, ..., cn, where
c0 = (q0, S q0, ω ⊣^k), cn = (POP, S q0 S POP, ⊣^k) and c(i) ⊢ c(i+1), is
called the accepting sequence of M for ω. Observe by Theorem 3.42 that
this sequence is unique. In general, we let c(i) = (q(i), α(i), ω(i) ⊣^k).
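The four kinds of move and the acceptance condition can be tried out on the
MSP(1) machine of Figure 3.21. The sketch below is a hedged illustration,
not the author's implementation. Its assumptions: "$" stands for the
end-marker; a stack is a Python list of levels, so the marker Δ is implicit;
the rule attached to each final state (RULES) is inferred from the f-table;
the entry f(1,S,⊣) = POP is added because the text states that
f(q0, S, ⊣^k) = POP, although the printed table omits it; and ε-moves are
left out because G has no ε-rules.

```python
END = "$"   # assumed stand-in for the end-marker
K = 1       # the example machine is MSP(1)
START = 1

F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (1, "S", END): "POP",  # last one added
    (2, "x", END): 3, (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}
G2 = {1: "S", 6: "A", 7: "B"}
# final state -> (left-hand side, length of right-hand side); inferred
RULES = {3: ("S", 2), 4: ("A", 2), 9: ("B", 2), 10: ("B", 1)}


def step(state, stack, inp):
    """Return the successor configuration, or None if no move is defined."""
    look, top = inp[:K], stack[-1]
    if state in RULES:                                  # case 3: reduction
        lhs, m = RULES[state]
        if len(top) < 2 * m + 2:
            return None
        body = top[:len(top) - 2 * m]                   # uncovers state body[-1]
        nxt = F.get((body[-1], lhs, look))
        return None if nxt is None else (nxt, stack[:-1] + [body + [lhs, nxt]], inp)
    if state == "POP":                                  # case 4: suspension
        if len(stack) < 2:
            return None
        below = stack[-2]                               # ends with: state, nonterminal
        nxt = F.get((below[-2], below[-1], look))
        return None if nxt is None else (nxt, stack[:-2] + [below + [nxt]], inp)
    if inp:                                             # case 1: read
        sym = inp[0]
        nxt = F.get((state, sym, inp[1:1 + K]))
        if nxt is not None:
            return nxt, stack[:-1] + [top + [sym, nxt]], inp[1:]
    nxt = G1.get((state, look))                         # case 2: prediction
    if nxt is not None:
        nonterminal = G2[nxt]
        return nxt, stack[:-1] + [top + [nonterminal], [nonterminal, nxt]], inp
    return None


def accepts(word, max_steps=10_000):
    """Run from the initial configuration; True if the accepting one is reached."""
    config = (START, [["S", START]], word + END * K)
    for _ in range(max_steps):
        state, stack, inp = config
        if state == "POP" and stack == [["S", START, "S", "POP"]] and inp == END * K:
            return True
        config = step(state, stack, inp)
        if config is None:
            return False
    return False
```

With these tables the string abx is accepted, passing through one
prediction (of B, from base state 5) and one suspension. Longer strings of
the language may need lookahead entries that the table as printed here does
not contain.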

3.5  Replacing  a  State  with  a  Splitting 

Now that we have some feeling for the nature and design of MSP(k)
machines, let us show that the model is indeed a good one for the result of
splitting states in an LR(k) machine. That is, we shall show that if we
start with an LR(k) machine and split one of its states; take the result
and split one of its states; and continue this process, then at every
stage we will be dealing with a well-constructed MSP(k) machine. We begin
by showing that this process starts off correctly.

Proposition 3.44 If M is the LR(k) machine for G, then M is an MSP(k)
machine associated with G.

Proof This is trivial to check. We let Q1 be just the starting state of M,
Q2 be the nonfinal states, Q4 the final states, and Q3 will be empty. Then
conditions 8, 9, 10, and 11 are true because G is LR(k) and by the method
of construction of the canonical LR(k) machine for a grammar, and the rest
of the conditions are either trivial or vacuously true. Q.E.D.

We further note that the conventional way of describing the operation
of an LR(k) machine coincides, for all practical purposes, with our
specification of how it operates as an MSP(k) machine.

Before we proceed further, we can use some additional terminology.


Definition 3.45 Let q and q' be states of an MSP(k) machine M. Then q
dominates q' if the following condition holds: for every sequence of
states q0, q1, ..., qn = q', starting with the starting state and ending with
q', such that q(i+1) is immediately accessible from q(i), it must be that
q = q(j) for some j, 0 ≤ j < n.


We note that dominance is transitive: if q dominates q' and q' dominates
q", then q dominates q".
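In a reduced machine, dominance can be tested by deleting q and asking
whether q' is still reachable from the starting state. The sketch below
illustrates this on the tables of the example machine of Figure 3.21. It is
only an illustration: the function names, the dictionary encoding, and "$"
for the end-marker are assumptions made here, and the test presumes that q'
is accessible.

```python
END = "$"  # assumed stand-in for the end-marker

F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (2, "x", END): 3,
    (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}
STATES = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, "POP"}


def successors(q, f, g1):
    """States immediately accessible from q."""
    out = {t for (p, _sym, _look), t in f.items() if p == q}
    out |= {t for (p, _look), t in g1.items() if p == q}
    return out


def reachable_avoiding(start, avoid, f, g1):
    """States reachable from start along paths that never visit `avoid`."""
    if start == avoid:
        return set()
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for t in successors(q, f, g1) - seen - {avoid}:
            seen.add(t)
            todo.append(t)
    return seen


def dominates(q, target, start, f, g1):
    """True if every path from start to target passes through q earlier."""
    return q != target and target not in reachable_avoiding(start, q, f, g1)
```

For instance, state 7 can be entered only by the predictive transition
from base state 5, so states 1 and 5 both dominate it, while state 6 does
not dominate state 5.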


Definition 3.46 Let q1, q2, ..., qn be a sequence of states of M such that
q(i+1) is immediately accessible from q(i). If q(i+1) = f(q(i), σ, τ) for
some σ and τ, let σ(i) = σ; if q(i+1) = g1(q(i), τ) for some τ, let
σ(i) = ε. Then qn is accessible from q1 by α, where α is the string
σ(1) σ(2) ... σ(n-1).

This is just an extension of the notion of α-successor of a state that
is common in machine theory. Since a base and its predictives are in some
sense a single state, the next state in a sequence to determine an α-successor
can either be a successor of a base or a successor of one of its initials.
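The set of states accessible by a given string can be computed in the same
way as for a nondeterministic automaton with ε-moves, treating each
predictive transition as an ε-move. The following sketch is illustrative
only; the names eps_closure and accessible_by, the empty string for ε, and
the dictionary encoding of the example machine of Figure 3.21 are
assumptions made here.

```python
END = "$"   # assumed stand-in for the end-marker
EPS = ""    # label contributed by g1-moves and by epsilon f-moves

F = {
    (1, "a", "b"): 5, (1, "A", "x"): 2, (2, "x", END): 3,
    (5, "B", "x"): 4, (8, "A", "x"): 9,
    (6, "a", "b"): 5, (6, "A", "x"): "POP",
    (7, "B", "a"): 8, (7, "B", "x"): "POP", (7, "b", "x"): 10,
}
G1 = {(5, "b"): 7, (8, "a"): 6}


def eps_closure(states, f, g1):
    """Add every state reachable through moves labelled epsilon."""
    seen, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        nxt = {t for (p, _look), t in g1.items() if p == q}
        nxt |= {t for (p, s, _look), t in f.items() if p == q and s == EPS}
        for t in nxt - seen:
            seen.add(t)
            todo.append(t)
    return seen


def accessible_by(start, alpha, f, g1):
    """States accessible from start by the symbol sequence alpha."""
    current = eps_closure({start}, f, g1)
    for sym in alpha:
        moved = {t for (p, s, _look), t in f.items() if p in current and s == sym}
        current = eps_closure(moved, f, g1)
    return current
```

In the example, the states accessible from state 1 by the string a are the
base state 5 together with its attached initial state 7.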

Let  us  recall  the  conventional  definition  of  successor. 


Definition 3.47 If q' = f(q, σ, τ), then q' is a σ-successor of q; in
addition, q is an ε-successor of itself. If there is a sequence
q = q1, ..., qn = q', such that q(i+1) is a σ(i)-successor of q(i), then q'
is an α-successor of q, where α = σ(1) σ(2) ... σ(n-1).

Lemma 3.48 Let M0 be the LR(k) machine for G, M an MSP(k) machine for G;
q0 is the starting state of M0, q0' the starting state of M. Then the
non-final state of M0 which is an α-successor of q0 is the union of all
non-final states of M which are accessible by α from q0'.

Proof Let us first observe that in M0, there is at most one non-final state
which is an α-successor of q0, for any α. This is easy to show by induction
on the length of α, since there is at most one non-final state which is a
σ-successor of q, for any state q and symbol σ.

Further note that the statement of the lemma assumes that there is a
non-final α-successor of q0 in M0 if and only if some non-final states are
accessible by α from q0' in M. This statement will come for free in the
proof of the lemma.

The proof is by induction on the length of α. If |α| = 0, then α = ε.
In M0, there are no ε-transitions from q0 to a non-final state, since ε
labels only transitions to final states for rules like A -> ε; hence the
single non-final state of M0 in this case is q0. Similarly in M, there are
no ε-transitions from q0' to non-final states; and since q0' is an initial
state, it has no attached initial states. Hence the only non-final state
accessible from q0' by ε in M is q0' itself. But q0 = q0' by construction
of M and M0, so this case holds.

Suppose then that the statement is true for |α| = n; we shall show it
true if |α| = n+1. Let α = βσ, where |β| = n and |σ| = 1. Then a state
q of M0 is an α-successor of q0 if and only if it is a σ-successor of a
β-successor of q0. By our remarks on ε-transitions, q is a non-final
α-successor of q0 if and only if q = f(q1,σ,τ), where q1 is a non-final
β-successor of q0. By induction, there are non-final states of M,
q1', q2', ..., qm', each of which is accessible by β from q0', such that
q1 = ∪ qi'. That means that the items of q1 are somehow spread out among
the qi'.

Now consider the state q. It is the completion of its essential items,
which in turn are formed by moving the dot one place to the right on all
items of q1 of the form A -> x.σy(w), where y ≠ ε. These items occur
somewhere in the qi'; there will be a state accessible by σ from some
particular qi' if and only if there are one or more such .σ items in it.
The non-final states accessible by σ from such a qi' include the non-final
σ-successor of qi', and any attached initial states, if that successor is a
base state. Thus the union of the non-final states accessible by σ from
qi' forms the completion of the essential items of the σ-successor of qi'.
Since qi' ⊆ q1, this union is contained in the σ-successor of q1; hence the
union of all these accessible states, for each qi', is contained in q as
well. But any item of q is a descendant of one of the essential items of
q. Any essential item of q is contained in the σ-successor of some qi';
any descendant of this essential item is either in this successor or in one
of its predictives. In either event, any item of q is contained in a
non-final state accessible from q0' by βσ = α; and so q equals the union of
these states and we are done. Q.E.D.

Corollary  3.49  Every  non-final  state  of  an  MSP(k)  machine  for  G  is  contained 
in  some  state  of  the  LR(k)  machine  for  G. 

We now describe the process by which a state of a machine may be replaced
by a split version of itself.

Algorithm 3.50 Let M be an MSP(k) machine for G, q ∈ Q2 an intermediate
state of M, and (B,Q) a splitting of q. Then we replace q by (B,Q) by
performing the following steps:

I) add B to Q3 and each P(i) in Q to Q1; define g2(P(i)) = A(i)
and g1(B,x) = P(i) for each x ∈ L_k(q,A(i)); define f(P(i),A(i),ω) =
POP if ω ∈ FOLLOW_k(A(i), B).

II) remove the state q from the set Q2; for any q1 in M, if
f(q1,σ,ω) = q, set f(q1,σ,ω) = B

III) designate B and each P(i) as a "new state"; recursively apply
step IV to B and each P(i) and to any other "new states" created
by step IV

IV) let q1 be a new state;

a) for each symbol σ, let E_σ be the items in q1 of the form
A -> α.σβ(w), where β ≠ ε, and let E_σ' be the corresponding
set of items A -> ασ.β(w);

b) define q' as follows: if E_σ' equals the set of essential
items of any state q" currently in Q2 or Q3, then q' = q";
otherwise, q' is the completion of the set of essential items
E_σ', it is added to the set Q2, and it is also defined as a
"new state";

c) then for any τ such that τ ∈ FIRST_k(βw) for some item
A -> α.σβ(w) in E_σ, let f(q1,σ,τ) = q';

d) and for any w such that A -> α.σ(w) is an item of q1, let
f(q1,σ,w) equal the final state associated with the grammar
rule A -> ασ;

e) after the definition of f(q1,σ,ω) for each σ and ω, remove
q1 from the list of new states.

V) eliminate all states in Q1, Q2, Q3 which are no longer
accessible from q0; the modified values of Q1, Q2, Q3, f,
g1, and g2, together with Q4 and q0, define the result
of this algorithm.

This replacement algorithm is quite straightforward. The old state
is discarded, replaced by the base state, which is appropriately attached
to its associated initials. All transitions into the old state are
reconnected to the base of the splitting. Then successors of the base and
initials are recursively computed in the conventional way, with one
exception: should the conventional successor of a state be an intermediate
state which has already been replaced by a state-splitting, the actual
successor is instead chosen to be the base of that state-splitting. This is
to prevent re-introduction of a state that has already been eliminated, and
to avoid having two states with identical essential items. We note in
passing that it is both possible and legal to have different initial states
which consist of precisely the same set of items; because initial states do
not have any essential items, this would not violate the definition.
However, if two different initial states do have the same set of items,
their successors will be the same states. We shall see later when it is
possible to merge different copies of the same initial state.
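Steps IV(a) and IV(b) of the algorithm can be sketched in a few lines. This
is a partial illustration under loud assumptions, not the full algorithm:
items are modelled as named tuples, the completion (closure) operation is
passed in as a parameter because it is defined elsewhere in the text, and
the names Item, advanced_essentials, find_or_create, and fresh_name are
hypothetical.

```python
from collections import namedtuple

# An LR(k) item A -> alpha . beta (ctx); `dot` is the dot's index in rhs.
Item = namedtuple("Item", "lhs rhs dot ctx")


def advanced_essentials(items, sigma):
    """Step IV(a): the items A -> alpha sigma . beta (w) with beta nonempty."""
    return frozenset(
        Item(i.lhs, i.rhs, i.dot + 1, i.ctx)
        for i in items
        if i.dot < len(i.rhs) and i.rhs[i.dot] == sigma  # dot stands before sigma
        and i.dot + 1 < len(i.rhs)                       # beta is not empty
    )


def find_or_create(essentials, by_essentials, complete, fresh_name):
    """Step IV(b): reuse the state with these essentials, else create one.

    by_essentials maps frozenset(essential items) -> state name for the
    current intermediate and base states. Returns (name, items, is_new);
    items is None when an existing state is reused.
    """
    if essentials in by_essentials:
        return by_essentials[essentials], None, False
    by_essentials[essentials] = fresh_name
    return fresh_name, complete(essentials), True


# Initial state 7 of the example: B -> .BA(x) and B -> .b(x)
state7 = {Item("B", ("B", "A"), 0, "x"), Item("B", ("b",), 0, "x")}
on_B = advanced_essentials(state7, "B")   # essentials of the B-successor
on_b = advanced_essentials(state7, "b")   # empty: handled by step IV(d)
```

Keying the lookup on essential items is what sends a transition to the base
of an earlier splitting instead of re-creating the replaced intermediate,
since the base keeps the essentials of the state it replaced.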

After all successors of the newly introduced states have been computed,
all leftover states from the original machine which can no longer be reached
in the new version are eliminated. Thus it is possible for one state-splitting
to "undo" the work of another. For example, suppose q2 is the σ-successor
of q1, and that both q1 and q2 can be split. If we first split q2, then the
base of the splitting will be the new σ-successor of q1. But if we then
split q1, it may happen that the .σ items of q1 are so scattered among the
base and initials of the splitting of q1 that the E_σ set of items are not
all together in some state. Then the σ-successors of the base and initials
of the q1-splitting will each contain just some of the essentials of q2,
and these successors will be computed as new states. And unless there are
other transitions to it, the base state of the splitting of q2, together
with its associated initial states, will vanish. In general, if q dominates
q', then the splitting of q may make q' no longer accessible, leading to
its elimination from the machine.

We  must  show  two  things  about  Algorithm  3.50  in  order  for  it  to  be 
meaningful.  We  have  to  prove  that  it  always  terminates  and  that  the  result 
of performing the replacement as specified by the algorithm will be a well-
constructed MSP(k) machine. The fact that it terminates is immediate, since
the  algorithm  iterates  on  Step  IV  only  so  long  as  new  states  are  being 
produced;  and  there  can  be  only  finitely  many  new  states,  since  there  are 
only  finitely  many  collections  of  LR(k)  items  altogether,  and  once  a  new 
state  has  been  treated  it  is  never  generated  again  as  a  new  state. 

In  order  to  show  that  the  procedure  is  well-defined  and  that  the  result 
is  a  valid  MSP(k)  machine,  we  really  need  establish  one  result:  that  each 
new  state  generated  during  the  algorithm  satisfies  Clause  9  of  the  definition 
of  MSI’(k)  machines;  namely  that  if  A  (to )  and  B  Pj.oP^(t)  are  items  of 

the  state,  with  a  £  V^,  then  u  t  FIRST^ (ap^T) ;  and  that  if  A  a.  ct(w)  and 
B  -»  Pj.ap2(T)  are  items,  then  u  t  FIUST^Pj7)*  This  suffices  because 
replacing q by a splitting does not affect those states not dominated by q,
and adds only new intermediate states to the machine. But this required
result is very easy to establish. Since q is a state of M, it is contained
in q', some state of the LR(k) machine for G; hence the same is true of
each component of the splitting. Each new state is the α-successor of some
component of the splitting for some α, and is readily seen to be contained
in an α-successor of q'. Since Clause 9 holds for any LR(k) state, it holds
for any subset of one, and hence for each new state created by the algorithm.

Now we consider the other side of the coin.


Theorem 3.51 Let M be any MSP(k) machine with n base states, n > 0. Then
there exists an MSP(k) machine M' with n-1 base states, and a state q of M'
and a splitting (B,Q) of q, such that M is the result of replacing q in M'
by the splitting (B,Q).


Proof We will describe how to construct the machine M' from M. Let the
base states of M be q_1, q_2, ..., q_n. Let q_1 be any one of these base states
which does not dominate any other one. (If every base state dominates some
other one, then some state must dominate itself since there are only finitely
many, and that is ridiculous.) Since q_1 is a base, it is the base state of
some splitting (B,Q). To get M' from M we will "undo" this splitting,
replacing q_1 by the state of which (B,Q) is a splitting.
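
The choice of a base state that dominates no other base state can be
illustrated with a small sketch. It assumes, as the later remark about
"states not dominated by q_1" suggests, that q dominates q' when every path
from the start state to q' passes through q. The machine is reduced to a
hypothetical dictionary of transitions, and the names `dominates` and
`non_dominating_base` are ours, not the thesis's.

```python
from typing import Dict, Hashable, Iterable, Optional, Set


def reachable_avoiding(transitions: Dict[Hashable, Set[Hashable]],
                       start: Hashable, avoid: Hashable) -> Set[Hashable]:
    """States reachable from `start` along paths that never visit `avoid`."""
    if start == avoid:
        return set()
    seen = {start}
    pending = [start]
    while pending:
        state = pending.pop()
        for nxt in transitions.get(state, ()):
            if nxt != avoid and nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen


def dominates(transitions, start, q, other) -> bool:
    """Assumed definition: every path from start to `other` goes through q."""
    if q == other:
        return False
    return other not in reachable_avoiding(transitions, start, q)


def non_dominating_base(transitions, start,
                        bases: Iterable[Hashable]) -> Optional[Hashable]:
    """Return some base state that dominates no other base state, if any."""
    bases = list(bases)
    for q in bases:
        if not any(dominates(transitions, start, q, r) for r in bases if r != q):
            return q
    return None


# Hypothetical machine: start -> b1 -> b2, where b1 and b2 are base states.
machine = {"start": {"b1"}, "b1": {"b2"}, "b2": set()}
chosen = non_dominating_base(machine, "start", ["b1", "b2"])
```

Here b1 dominates b2 but not conversely, so b2 is the state whose splitting
would be undone first.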

We let q = q_1 ∪ ⋃_i P_i; that is, the union of all the items of the
components of the splitting (B,Q). We observe that the essentials of q are the
same as those of q_1. In order to get M' from M, we do the following:
eliminate states q_1 and P_i, for each P_i in the splitting; cause all
transitions formerly going into q_1 to go into q; compute the successors of q
in the conventional way; and finally, eliminate any other states which become
inaccessible. When we say "compute the successors in the conventional
way", we mean that the essential items of any successor state are compared
with the essential items of existing states: if there is a match, that
state becomes the successor; otherwise, a new intermediate state is
generated.
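
The matching step just described can be sketched as follows. This is only
an illustration under stated assumptions: items are modelled as hypothetical
tuples (left side, right side, dot position, lookahead), an item is taken to
be essential when its dot is not at the left end, and the closure
computation that produces the candidate item set is omitted.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical item encoding: (lhs, rhs, dot position, lookahead string).
Item = Tuple[str, Tuple[str, ...], int, str]


def essentials(items: FrozenSet[Item]) -> FrozenSet[Item]:
    """Items whose dot is not at the left end (assumed meaning of 'essential')."""
    return frozenset(item for item in items if item[2] > 0)


def find_or_create(states: Dict[str, FrozenSet[Item]],
                   candidate: FrozenSet[Item]) -> str:
    """Return the existing state whose essential items match those of
    `candidate`; otherwise register `candidate` as a new intermediate state."""
    wanted = essentials(candidate)
    for name, items in states.items():
        if essentials(items) == wanted:
            return name  # a match: the existing state becomes the successor
    index = len(states)
    while f"s{index}" in states:  # pick a state name not already in use
        index += 1
    states[f"s{index}"] = candidate
    return f"s{index}"


# A machine with one state, and two candidate successor item sets.
states = {"q0": frozenset({("S", ("x", "A"), 1, "$")})}
same = frozenset({("S", ("x", "A"), 1, "$"), ("A", ("a",), 0, "$")})
fresh = frozenset({("S", ("x", "A"), 2, "$")})
first = find_or_create(states, same)    # reuses q0
second = find_or_create(states, fresh)  # creates a new state
```

Because only essential items are compared, two candidates that differ in
their non-essential items alone are identified with the same state.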

It is straightforward to show that M' is indeed a well-defined MSP(k)
machine. This new machine M' differs from M only with respect to the state
q_1 and its successors. That is, all states of M that can be reached from
the start state without going through q_1 (i.e., all states not dominated
by q_1) are in M' as well, and connected in the same way. In particular, all
the base states of M other than q_1 are in M' too. Since no new base states
are generated in computing successors of the new state q, it is true that M'
has only n-1 base states.

Now let us consider the result of replacing q in M' by the splitting
(B,Q). By construction of q, (B,Q) is indeed a splitting of q. Those
states not dominated by q_1 in M are not dominated by q in M', so those
states are unaffected by replacing q_1 by q, and subsequently replacing q
by the splitting (B,Q). The new states introduced into M' as successors
of q are no longer accessible once q is split into (B,Q); and the states
eliminated when q_1 and the P_i are eliminated are reintroduced when (B,Q)
replaces q in M'. All connections among the states of M that are severed
when q_1 and the P_i are removed are reconnected when they are reintroduced.
Thus M indeed results when q is replaced by (B,Q) in M'.


Q.E.D. 


Theorem 3.52 Let M be any MSP(k) machine for G with n base states. Then
there is a sequence of MSP(k) machines for G, M_0, M_1, ..., M_n, such that
M_0 is the LR(k) machine for G, M_n = M, and M_{i+1} is obtained by replacing
some state of M_i by a splitting of it.

Proof Immediate from the preceding theorem. Q.E.D.

This last theorem justifies our definition of MSP(k) machines. We have
already seen that starting with an LR(k) machine, and performing a series
of state-splittings, always results in an MSP(k) machine. Now we see that
every MSP(k) machine can be so obtained. Thus the notion of MSP(k) machine
precisely captures the notion we were striving for, namely the result of
applying a number of state-splittings to an LR(k) machine. This is a very
good state of affairs, for while our model of MSP(k) machines may be a complex
one and hence MSP(k) machines may be difficult to design from scratch,
replacing a state by a splitting is an elementary procedure and one that is
easy to perform repeatedly, starting with the LR(k) machine. Thus it is enough
for us to know how to split states in order to build MSP(k) machines.

As an example of how this state-splitting process works, consider the
LR(1) machine of Figure 3.22. Each of the three intermediate states can be
split, some in several ways. The result of sequentially replacing two of
these states by splittings is shown in Figure 3.23; we do not bother to
show how this is done in two steps since these two states occur in separate
areas of the machine.

Now suppose that we want to split the remaining intermediate state of
this MSP(1) machine, and that we choose a splitting such that the only item
in the base state of the splitting is the essential item S → x.AD(⊣).
(It is straightforward that this is indeed a legal splitting.) Then the
result of this splitting is shown in Figure 3.24.


S → .xAD(⊣)

S → x.AD(⊣)

S → xA.D(⊣)
D → .d(⊣)

S → xAD

A → .AB(d)
A → .AB(b)
A → .XC(d)
A → .XC(b)
X → .a(c)

A → X.C(d)
A → X.C(b)

C → .c(d)
C → .c(b)

A → A.B(d)
A → A.B(b)
B → .b(d)
B → .b(b)

A → AB

Figure 3.24




Splitting  this  state  has  effected  changes  in  the  machine.  The 
A-successors of the base and predictive states of this splitting are new
states,  which  add  up  to  the  A-successor  of  the  state  being  split.  Moreover, 
the  A-successor  of  the  state  being  split  here  (which  itself  was  the  base 
state  of  a  splitting)  is  no  longer  accessible  once  the  splitting  has  been 
done;  so  it  and  its  attached  predictive  states  are  eliminated  from  the 
machine.  Thus  one  splitting  can  "undo"  the  work  of  a  previous  one.  Our 
latter  results,  especially  Theorem  3.52,  assure  us  that  this  resulting 
MSP(l)  machine  could  have  been  achieved  in  a  less  roundabout  fashion,  by 
making  just  two  splittings  on  the  LR(1)  machine;  rather  than  three,  which 
interfere  with  each  other. 
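
The elimination of inaccessible states that makes one splitting "undo"
another amounts to a reachability computation. The sketch below assumes a
hypothetical representation of the machine as a dictionary from each state
to the set of its successors; the state names are invented for the
illustration and do not correspond to Figure 3.24.

```python
from collections import deque
from typing import Dict, Hashable, Set


def reachable(transitions: Dict[Hashable, Set[Hashable]],
              start: Hashable) -> Set[Hashable]:
    """All states that can be reached from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def prune(transitions: Dict[Hashable, Set[Hashable]],
          start: Hashable) -> Dict[Hashable, Set[Hashable]]:
    """Drop every state that is no longer accessible from `start`."""
    keep = reachable(transitions, start)
    return {s: set(succ) for s, succ in transitions.items() if s in keep}


# Hypothetical situation: q2 (a base state with predictive state p2) used to
# be a successor of the state that has just been split into b1 and p1. The
# successors of b1 and p1 are the new states n1 and n2.
machine = {
    "start": {"b1"},
    "b1": {"p1", "n1"},
    "p1": {"n2"},
    "n1": set(),
    "n2": set(),
    "q2": {"p2"},  # leftover from the earlier splitting
    "p2": set(),
}
pruned = prune(machine, "start")
```

After pruning, q2 and its predictive state p2 have vanished, which is the
effect described above.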

3.6 Parsing with MSP(k) Machines

It is now our purpose to show that the language accepted by an MSP(k)
machine for the grammar G is precisely L(G), the language generated by G.
This will confirm our design of MSP(k) machines as alternate parsing machines
for G. We shall establish this result by showing that L(M) = L(M_0), where
M_0 is the canonical LR(k) machine for G. Since it is well known that
L(M_0) = L(G), this will give us our desired theorem. The proof of this
assertion is fairly tedious, and uses standard "twin-machine" proof
techniques: we show how to construct from an accepting sequence in M_0 a
corresponding sequence in M, and vice versa. First, we need a number of
preliminary definitions and results about the way an LR(k) machine operates.

Definition 3.52 Let M_0 be an LR(k) machine, q a non-final state of M_0, and
c_0, c_1, ..., c_n an accepting sequence of configurations of M_0 for the
string ω. Then an entry to q in the sequence is any stack of the form βXq,
X ∈ V_N ∪ V_T.

We  note  that  there  may  be  many  entries  to  q  in  the  course  of  a  parse; 
an  entry  to  q  is  simply  the  occasion  on  which  it  is  transferred  to,  either 
as  the  result  of  a  read  or  after  a  reduction. 


Definition 3.53 Let βXq be an entry to q in an accepting sequence; the
corresponding exit from q is the first stack α_j after the entry, of which
βXq is not a proper prefix.

Since we are dealing with an accepting sequence of configurations,
the last stack in the sequence will not have βXq as a prefix, and so
there is an exit for each entry. The exit corresponding to an entry is
the first stack in which that entry no longer figures; that is, it is
the point where a reduction corresponding to one of the essential items of
q is made.
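
To make the pairing of entries and exits concrete, here is a minimal sketch.
Stacks are modelled as tuples that alternate state names and grammar
symbols; the sample sequence of stacks is hypothetical and is not taken from
any machine in the text.

```python
from typing import Optional, Sequence, Tuple

Stack = Tuple[str, ...]


def is_proper_prefix(prefix: Stack, stack: Stack) -> bool:
    """True when `stack` starts with `prefix` and is strictly longer."""
    return len(prefix) < len(stack) and stack[:len(prefix)] == prefix


def find_exit(stacks: Sequence[Stack], entry_index: int) -> Optional[int]:
    """Index of the first later stack of which the entry is not a proper
    prefix, following Definition 3.53; None if there is no such stack."""
    entry = stacks[entry_index]
    for j in range(entry_index + 1, len(stacks)):
        if not is_proper_prefix(entry, stacks[j]):
            return j
    return None


# Hypothetical stacks of an accepting sequence.
stacks = [
    ("q0",),
    ("q0", "a", "q1"),            # entry to q1
    ("q0", "a", "q1", "b", "q2"), # entry to q2
    ("q0", "a", "q1", "B", "q3"), # exit from q2: b has been reduced to B
    ("q0", "S", "q4"),            # exit from q1: a B has been reduced to S
]
exit_of_q1 = find_exit(stacks, 1)
exit_of_q2 = find_exit(stacks, 2)
```

The entry to q1 at index 1 stays on the stack through indices 2 and 3 and is
left at index 4, while the entry to q2 is left immediately at index 3.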

Definition 3.54 Let α_i = βXq be an entry to q, and α_j the corresponding
exit. We say that the nonessential item A → .ψ(τ) is recognized by this entry
to q if there is a stack α_m, i < m < j, such that α_m = βXqψ'q', where q' is
the final state for A → ψ and ψ' consists precisely of the symbols of ψ
alternating with state names, and where ω_m ⊣^k /k = τ.




We observe that if these conditions hold for α_m, then the next step
performed by M_0 will be to reduce ψ to A; so α_{m+1} = βXqAq'', for some q''.
We also have the following:

Lemma 3.55 If an entry to q recognizes the item A → .ψ(τ), then A → .ψ(τ)
is an item of q.

Proof This follows readily from the way an LR(k) machine operates and the
way its states are composed. Suppose stack α_m is βXqψ'q', where ψ' =
ψ_1 q_1 ψ_2 q_2 ... ψ_{r-1} q_{r-1} ψ_r. Then q' = f(q_{r-1}, ψ_r, ω_m ⊣^k /k);
therefore q' = f(q_{r-1}, ψ_r, τ). Therefore there must be an item
A → ψ_1 ψ_2 ... ψ_{r-1}.ψ_r(τ) in q_{r-1}. Working backwards through ψ, we
see that A → ψ_1 ... ψ_{i-1}.ψ_i ... ψ_r(τ) is an item of q_{i-1}. Thus
A → .ψ(τ) is an item of q. Q.E.D.

Lemma 3.56 Let α_i = βXq be an entry to q; suppose A → .ψ(τ) is some
non-essential item recognized by q, and that B → .φ(ρ) is the next
non-essential item recognized by q. Then A → .ψ(τ) is an immediate descendant
of B → .φ(ρ).

Proof At the time A → .ψ(τ) is recognized, the stack is βXqψ'q', which then
becomes βXqAq''. The only way for this A to be removed from the stack is
by the performance of a reduction that exposes either q or some state lower
down in the stack; but this can not happen before the next non-essential item
is recognized by q. So when B → .φ(ρ) is found, the A is still on the stack.
But since recognition of B → .φ(ρ) entails popping off φ and exposing q, we
have that A is the first symbol of φ. Thus the two items are really A → .ψ(τ)
and B → .Aφ(ρ). To show the former is an immediate descendant of the latter,
we need only show τ ∈ FIRST_k(φρ).

To show this, let ω_1 be the remaining input at the time A → .ψ(τ) is
recognized and ω_2 the remaining input when B → .Aφ(ρ) is found. Observe
that ω_2 is a suffix of ω_1; some prefix of ω_1, call it ω_3, has been read
between those two recognitions. That is, ω_1 = ω_3 ω_2, and what has occurred
is that ω_3 has been reduced to φ by the machine. By a well-known result of
LR(k) machines, this means that φ ⇒* ω_3. We know that τ = ω_1 ⊣^k /k and
that ρ = ω_2 ⊣^k /k. If |ω_3| ≥ k, then τ is a prefix of ω_3 and so
τ ∈ FIRST_k(φ). If |ω_3| < k, we can set τ = ω_3 ω_4, where ω_4 is a prefix
of ω_2 ⊣^k of length less than or equal to k. Since ρ is the prefix of
ω_2 ⊣^k of length k, this means ω_4 is a prefix of ρ; hence τ ∈ FIRST_k(φρ).
In either case, we are done. Q.E.D.
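
The two cases at the end of this proof can be checked on small strings. The
sketch below assumes that ω ⊣^k /k denotes the first k symbols of ω followed
by k endmarkers; the character `$` stands in for the endmarker ⊣, and the
function names are ours.

```python
END = "$"  # stand-in for the endmarker (an assumption of this sketch)


def lookahead(omega: str, k: int) -> str:
    """First k symbols of omega followed by k endmarkers."""
    return (omega + END * k)[:k]


def tail_of_tau(omega3: str, omega2: str, k: int) -> str:
    """With omega1 = omega3 + omega2 and tau = lookahead(omega1, k), return
    the part of tau lying beyond omega3 (empty when len(omega3) >= k)."""
    tau = lookahead(omega3 + omega2, k)
    return tau[len(omega3):]


# Case |omega3| >= k: tau is a prefix of omega3.
tau_long = lookahead("abc" + "d", 2)

# Case |omega3| < k: tau = omega3 + omega4 with omega4 a prefix of rho.
tau_short = lookahead("a" + "b", 3)
omega4 = tail_of_tau("a", "b", 3)
rho = lookahead("b", 3)
```

In the second case the lookahead τ runs past the end of ω_3, and what it
borrows from the rest of the input is exactly a prefix of ρ.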


We want to extend those ideas to the concept of an entry to a state
recognizing  an  essential  item. 


Definition 3.57 Let α_i = βXq be an entry to state q, and α_j the
corresponding exit. We say that the essential item A → φ.ψ(τ) is recognized
by this entry if there is a stack α_m, i < m < j, such that α_m = βXqψ'q',
where ψ' consists of the symbols of ψ alternating with state names and q' is
the final state for A → φψ and ω_m ⊣^k /k = τ.

Note that if these conditions are met, α_{m+1} is obtained by removing all
of ψ' and some of βXq from the stack, thus exposing some state lower down in
the stack, and applying A to that state. In other words, α_{m+1} will be the
exit from q, and so it is α_j. Thus m = j-1. We note from this that if α_j is
the exit corresponding to entry α_i, then α_{j-1} satisfies the above
definition as recognizing some essential item. Thus we have:

Lemma 3.58 If α_i = βXq is an entry to state q, then it recognizes exactly
one essential item A → φ.ψ(τ).

Furthermore,  there  is  the  following  analogy  to  Lemma  3.55. 

Lemma 3.59 If an entry to q recognizes the essential item A → φ.ψ(τ),
then A → φ.ψ(τ) is an item of q.

Proof Similar to the proof of Lemma 3.55.

Lemma 3.60 If an entry to q recognizes the essential item A → φ.ψ(τ),
and if the last non-essential item it recognizes is B → .π(ρ), then B → .π(ρ)
is an immediate descendant of A → φ.ψ(τ).

Proof Similar to the proof of Lemma 3.56.

Lemma 3.61 The first item recognized by an entry to q is a terminal item.

Proof If α_i = βXq, α_{i+1} must be βXqaq', where a is ε or an element of
V_T, since q is non-final; thus by definition, a will be the first symbol of
the first item found by this entry. Q.E.D.

Lemma 3.62 If α_i = βXq is an entry to state q, and A → .aφ(τ) is the first
item recognized by the entry, then ω_i ⊣^k /k ∈ FIRST_k(aφτ).

Proof Similar to the second half of the proof of Lemma 3.56.

Theorem 3.63 Suppose α_i is an entry to state q, and that ω_i ⊣^k /k ∈ L.
Then the items recognized by α_i form an L-chain through the state q.

Proof By the preceding lemmas and the definition of an L-chain. Q.E.D.

This is the first intermediate result toward which we have been heading.
We have succeeded in formalizing the notions we discussed earlier, of how
an LR(k) state is used during the course of a parse, and have shown the
utility of our notion of a chain. We shall proceed to apply this result
in order to show that L(M_0) ⊆ L(M).

Definition 3.64 Let M_0 be the LR(k) machine for G, M some MSP(k) machine for
G. Suppose ω ∈ L(M_0), and c_1, c_2, ..., c_n is the accepting sequence of
configurations of M_0. Then we define a sequence of configurations of M,
h(c_1), ..., h(c_n), as follows: h(c_1) is the initial configuration of M for
ω; h(c_{i+1}) is the first successor configuration of h(c_i) whose state is
neither POP nor an initial state.

We want the sequence h(c_1) ... h(c_n) to keep track of where M is in the
sequence c_1 ... c_n. Predictions and suspensions are extra steps done by M
when compared with M_0, so the results of these operations are skipped over.
As of yet, we have no assurance that h(c_i) is well defined; but if it is,
its state is either a base, an intermediate, or a final state. We shall
show that this state of h(c_i) can do exactly what the state of c_i is called
upon to do in this accepting sequence.
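
Definition 3.64 can be read as a filter on the run of M. The sketch below
assumes that each configuration of M is tagged with the kind of its state;
the tags and state names are hypothetical labels chosen for the
illustration.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    state: str
    kind: str  # "base", "intermediate", "final", "initial" or "POP" (assumed tags)


def h_images(run: List[Config]) -> List[Config]:
    """Keep the initial configuration, then every later configuration whose
    state is neither POP nor an initial state; the result is the sequence
    h(c_1), h(c_2), ... of Definition 3.64."""
    if not run:
        return []
    return [run[0]] + [c for c in run[1:] if c.kind not in ("POP", "initial")]


# Hypothetical run of M: one prediction (P1) and one POP step are skipped.
run = [
    Config("q0", "base"),
    Config("P1", "initial"),
    Config("q1", "intermediate"),
    Config("f1", "final"),
    Config("POP", "POP"),
    Config("q2", "intermediate"),
]
images = [c.state for c in h_images(run)]
```

The surviving configurations are those that have a counterpart in the run
of M_0, which is what the induction in Proposition 3.66 relies on.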

Definition 3.65 If ℓ is a stack level, then ℓ is a string of the form
A q_0 X_1 q_1 X_2 q_2 ... X_n q_n B, where n ≥ 0; then CON(ℓ), the contents
of ℓ, is the string X_1 X_2 ... X_n. If t is a topmost stack level, then t is
of the form A q_0 X_1 q_1 ... X_n q_n; then CON(t) is the string X_1 ... X_n.
If α is a stack, then α = ℓ_1 Δ ℓ_2 Δ ... Δ ℓ_m Δ t; LIN(α), the
linearization of α, is the string CON(ℓ_1) · CON(ℓ_2) ··· CON(ℓ_m) · CON(t),
the concatenation of the contents of its stack levels.

The contents function ignores the state names, as well as the names
of the nonterminals predicted on any level.
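
The functions CON and LIN are easy to compute once a representation is
fixed. The sketch below assumes each stack level is a list of strings laid
out as in Definition 3.65, with the topmost level lacking the trailing
predicted nonterminal; the sample stacks are hypothetical.

```python
from typing import List, Sequence


def con(level: Sequence[str], topmost: bool) -> str:
    """Contents of a level A q0 X1 q1 ... Xn qn [B]: the string X1 ... Xn."""
    # Drop the leading nonterminal A, and the trailing B on non-topmost levels.
    body = level[1:] if topmost else level[1:-1]
    # body is q0 X1 q1 ... Xn qn, so the symbols sit at the odd positions.
    return "".join(body[1::2])


def lin(stack: Sequence[Sequence[str]]) -> str:
    """Linearization: concatenated contents of the levels, bottom to top."""
    last = len(stack) - 1
    return "".join(con(level, i == last) for i, level in enumerate(stack))


# A two-level stack of M: after reading x, the nonterminal A was predicted
# and a new level was started, on which a and b have been read.
msp_stack: List[List[str]] = [
    ["S", "q0", "x", "q1", "A"],
    ["A", "p0", "a", "p1", "b", "p2"],
]
# A one-level stack holding the same symbols, as an LR(k) machine would.
lr_stack: List[List[str]] = [["S", "q0", "x", "q1", "a", "q2", "b", "q3"]]
```

Both stacks linearize to the same string, which is the kind of agreement
between the two machines that the next proposition asserts.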


Proposition 3.66 If ω ∈ L(M_0), M is an MSP(k) machine for G, and
h(c_1), ..., h(c_n) is defined as above, then for each i, 1 ≤ i ≤ n:

i) h(c_i) is a well-defined configuration of M, (q_i', α_i', ω_i' ⊣^k)

ii) ω_i = ω_i'

iii) LIN(α_i) = LIN(α_i')

iv) if q_i is a final state, q_i' is the same final state

v) if q_i is non-final then q_i' ⊆ q_i; furthermore, I ∈ q_i',
where I is the essential item of q_i recognized by c_i, which is an
entry to q_i

Proof We shall prove this by induction on i. The last two clauses of the
proposition are the ones that "carry" the proof, making the others immediate.

Basis i = 1: This is straightforward by the definition of h(c_1).

Induction Suppose the statement is true for i = m; we shall show it true
for i = m+1. Since M_0 is an LR(k) machine, c_{m+1} can be either a read or
a reduce successor of c_m.

Let us first consider the case where c_{m+1} is a read successor of c_m.
In this case, q_m must be an intermediate state, and q_m' is a base or
intermediate state such that q_m' ⊆ q_m. In going from c_m to c_{m+1}, M_0
reads the first symbol (call it a) from the remaining input ω_m. As we saw in
Lemma 3.61, the first item recognized by c_m, an entry to q_m, will be an item
A → .aψ(τ); and by Theorem 3.63, this item will be a descendant of I, the
essential item recognized by c_m.

If q_m' is an intermediate state, since I ∈ q_m', A → .aψ(τ) will be a
member of q_m' as well, since it is in the completion of I. There may be
other .a items in q_m' as well, but they will all be in q_m too. So since
f_{M_0}(q_m, a, ρ) is defined, where aρ = ω_m ⊣^k /k+1, f_M(q_m', a, ρ) will
also be a well-defined state. Thus machine M, when in the configuration with
q_m' as the state and remaining input ω_m' = ω_m, will have a well-defined
successor, gotten by reading the symbol a from the input. Thus h(c_{m+1})
is well-defined. We had ω_m = ω_m', both headed by an a; ω_{m+1} and ω'_{m+1}
are gotten from their predecessors by removing the first symbol, so
ω_{m+1} = ω'_{m+1} as well. Furthermore, α_{m+1} is the same as α_m, except
for the addition of the symbol a and one more state name; since an analogous
statement is true for α'_{m+1}, and since LIN(α_m) = LIN(α_m'), we have
LIN(α_{m+1}) = LIN(α'_{m+1}). If q_{m+1} is not a final state, then the
essential item it recognizes is A → a.ψ(τ), which by construction will be in
q'_{m+1} too. Furthermore, since q_m' ⊆ q_m, it is immediate that the a/ρ
successor of q_m' is contained in the a/ρ successor of q_m. If q_{m+1} is the
final state for A → a, then q'_{m+1} will be the same state, and so this
sub-case is done.

Now we must consider the possibility that q_m' was a base state, rather
than an intermediate; this is still under the assumption that c_{m+1} is a
read-successor of c_m. Let (B,Q) be the splitting of which q_m' is the base
state, where Q = {(A_i,P_i)}. There are two possibilities: either ω_m ⊣^k /k
is an element of the predictive language of A_i, for some i; or it is in none
of the predictive languages. If it is in the predictive language for the
nonterminal A_i, then the first item recognized by q_m, which we shall call
A → .aψ(τ), is an L_i(A_i)-terminal item and also a descendant of I, the
essential item recognized by the α_m entry to q_m. Since I is in q_m', the
base state of the splitting (B,Q), and since A → .aψ(τ) is a descendant
of I, this item will have to be in one of the components of the splitting.
Since it is an L_i(A_i)-terminal item, it will be in P_i, the predictive
state for A_i. Now consider the actions of M after entering configuration
h(c_m). The lookahead is ω_m' ⊣^k /k; by hypothesis, this equals ω_m ⊣^k /k,
which is in the A_i-predictive language. So since q_m' is the base state of
the splitting (B,Q), M will then create a new level and predict A_i, jumping
to state P_i. The state P_i is a predictive state, so even though h(c_m)
has a successor c', we can not let it be h(c_{m+1}). But now observe the next
action of M. It is in state P_i, which is a predictive state associated with
base q_m'. Since q_m' ⊆ q_m, we have P_i ⊆ q_m as well. The item A → .aψ(τ)
is in q_m as well as in P_i. Since it is the first item recognized by q_m,
f_{M_0}(q_m, a, ρ) is well-defined, where aρ = ω_m ⊣^k /k+1; so
f_M(P_i, a, ρ) is also well-defined. But aρ are exactly the k+1 symbols of
lookahead M will see when it jumps to P_i. Thus this configuration c' has a
successor as well, which it gets to by reading the symbol a onto the stack
and following the a/ρ-transition out of P_i. This configuration will be by
definition h(c_{m+1}), and it is easy to see that it satisfies all the
required properties, as follows.


Either q_{m+1} is the final state for A → a or the essential item it
recognizes is A → a.ψ(τ); in the first case, q'_{m+1} is the same state, while
in the other q'_{m+1} contains A → a.ψ(τ) as well. In the latter case, since
q'_{m+1} is the a/ρ-successor of P_i, which is contained in q_m, we have
q'_{m+1} ⊆ q_{m+1}, which is the a/ρ-successor of q_m. The remaining inputs,
ω_{m+1} and ω'_{m+1}, are the same, since they are gotten from ω_m and ω_m'
by eliminating the first symbol. Finally, α_{m+1} differs from α_m in having
the symbol a and a new state name; α'_{m+1} differs from α_m' in having a new
stack level, with the predicted nonterminal written there and on the previous
level, and the symbol a and two state names on the top level. But the only
part of this that affects LIN is the symbol a; that is,
LIN(α_{m+1}) = LIN(α_m) · a and LIN(α'_{m+1}) = LIN(α_m') · a. Since
LIN(α_m) = LIN(α_m'), we are done with this case.

The other possibility, still assuming that c_{m+1} is a read successor of
c_m and that q_m' is a base state, is that ω_m ⊣^k /k is not in any predictive
language of the splitting (B,Q), of which q_m' is the base. In this case,
the first item recognized by c_m is a descendant of I which will also be in
the base state, q_m'; and so f_M(q_m', a, ρ) will be well-defined. The
successor of h(c_m) will be well-defined, and it will be h(c_{m+1}); the
necessary conditions are verified just as we have done above.

This completes the first half of our proof, for the case where c_{m+1} is
a read-successor of c_m. We note that the above proof is also valid where the
symbol being read is ε, i.e., where the first item recognized by c_m is
A → ε, or where the first item recognized by c_m is an essential item.

Now we must consider the case where c_{m+1} is a reduce successor of c_m.
In this case, the state of c_m must be the final state for some rule A → ψ,
so the state of h(c_m) must be the same state, by induction. The
configuration c_{m+1} will be obtained by popping ψ and the interleaved state
names off the stack, exposing some state q, and applying A to q. The stack
α_{m+1} will be of the form βXqAq_{m+1}. Now there must have been some earlier
entry to this state q; let α_i be the last previous stack equal to βXq. In
other words, c_i is an entry to state q_i = q, and c_m is the recognition of
one in the sequence of items that c_i will recognize before its exit. By
Theorem 3.63 this sequence forms a chain of items; the item
A → .ψ(ω_m ⊣^k /k) is one in this chain. We can construct the part of this
chain below this item by looking back at all configurations c_j, where
i < j < m, such that α_j = βXqφ'q_j, where q_j is the final state for B → φ;
in that case, B → .φ(ω_j ⊣^k /k) is some item in the chain which follows
A → .ψ(ω_m ⊣^k /k). All of these items are descendants of I, the essential
item recognized by the entry to q_i at c_i.

We have seen that at configuration h(c_m), machine M is in the final
state for A → ψ. By the way M is designed to operate, it will try to pop ψ
off the top level of its stack, expose some state name, and apply A to that
state. We have to show that M can successfully complete this task.

By the way an MSP(k) machine operates, it can enter the final state for
A → ψ only if all the symbols of ψ are on the topmost stack level. So M
will be able to pop ψ off the stack; but what state will this expose? Let
us examine what has happened to the stack of M, in between configurations
h(c_i) and h(c_m). By the inductive hypothesis and what we have already shown,
we see that M performs all of the reads and reductions that M_0 does; the only
difference between the operations of the two machines before c_m is that M may
have made and suspended some predictions. We are not interested in any
predictions that M may have made after h(c_i) that have been fulfilled before
h(c_m); but let us suppose that M made a prediction at h(c_j), i < j < m, that
has not yet been fulfilled at h(c_m). But if this is to be the case, all of ψ
must be on the topmost level created at h(c_j); but the first symbol of ψ has
to be on the same level as q_i', if popping off ψ is to reveal q_i'. So either
no unfulfilled predictions have been made between h(c_i) and h(c_m), or
exactly one had been made at h(c_i). In the former case, the state exposed
when M pops off ψ at h(c_m) will be none other than q_i', the state at h(c_i);
in the latter case, q_i' is a base state, and the state exposed by popping
off ψ is some predictive state associated with it.


Let us consider, then, what happened at h(c_i). There are several
possibilities:

i) That q_i' was an intermediate state. We know that A → .ψ(ω_m ⊣^k /k),
the item recognized at c_m, is a descendant of I, the essential item
recognized by c_i. Let B → .Aφ(τ) be the next item recognized by c_i after
c_m. Then B → .Aφ(τ) is also a descendant of I, and A → .ψ(ω_m ⊣^k /k) is its
immediate descendant. By hypothesis, I ∈ q_i'; therefore, since q_i' is the
completion of its essential items, A → .ψ(ω_m ⊣^k /k) and B → .Aφ(τ) are
both in q_i' as well. When ψ is popped off the stack at h(c_m), q_i' is
exposed, and the lookahead is ω_m ⊣^k /k; since f_M(q_i', A, ω_m ⊣^k /k) is
well-defined (since q_i' ⊆ q_i and since B → .Aφ(τ) is in q_i', with
ω_m ⊣^k /k ∈ FIRST_k(φτ)), h(c_m) will thus have a well-defined successor
configuration. Its state will be f_M(q_i', A, ω_m ⊣^k /k). If φ is not ε,
then since B → .Aφ(τ) is the next item to be recognized by c_i, B → A.φ(τ)
will be the essential item recognized by c_{m+1}; but this item will be in
q'_{m+1} as well; and since q_i' ⊆ q_i, q'_{m+1} will be contained in q_{m+1}.
If φ = ε, then both q_{m+1} and q'_{m+1} will be final states for B → A. The
other properties of h(c_{m+1}) are easy to verify.


ii)  The  other  possibility  is  that  q^'  was  a  base  state,  say  of  the 
splitting  (B,Q).  There  are  two  possibilities  to  consider  here:  that  the 
lookahead  the  time  of  entry  to  q^  was  in  some  predictive,  language, 

or  that  it  was  in  none. 
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t  k 

a) Suppose $\omega_i \dashv^k /k$ was not in any predictive language. Since $I$, the essential item recognized by $c_i$, is in $q_i'$, every item recognized by $c_i$ will be in some component of the splitting $(B,Q)$. Any item $A \to .Y(\tau)$ recognized by $c_i$ will have $\omega_i \dashv^k /k \in \mathrm{FIRST}_k(Y\tau)$. Since $\omega_i \dashv^k /k$ was not in any predictive language, this means that any item $A \to .Y(\tau)$ recognized by $c_i$ will be in the base of the splitting, $q_i'$, by the definition of the base of a splitting.

At configuration $h(c_i)$, since $\omega_i' \dashv^k /k = \omega_i \dashv^k /k$ was not in any predictive language, $M$ will not have made a prediction. So when it enters the final state, at $h(c_m)$, for $A \to Y$, it will pop off $Y$ and expose the state $q_i'$. Using the same arguments as we did before, the item $B \to .A\varphi(\tau)$, corresponding to the next item after $A \to .Y(\omega_m \dashv^k /k)$ found by $c_i$, will also be in $q_i'$. Thus, since $q_i' \subseteq q_i$, $f_M(q_i', A, \omega_m \dashv^k /k)$ is well-defined, and satisfies all the required properties.

b) Now suppose $\omega_i \dashv^k /k$ was an element of some predictive language of the splitting $(B,Q)$, say for $A_j$. Then at $h(c_i)$, $M$ would have predicted an $A_j$ and started a new level to find an $A_j$. Intuitively, there are three possibilities: by the time $A \to .Y(\omega_m \dashv^k /k)$ is recognized, an $A_j$ has already been found and the prediction fulfilled; an $A_j$ has not yet been found; or $A = A_j$, and the reduction of $Y$ to $A$ fulfills the prediction. We shall see what causes each of these three possibilities.


Let us consider the sequence of items recognized by $c_i$ between $c_i$ and $c_m$. As we have seen, these items, when taken together with the rest of the items found by $c_i$ after $c_m$, form an $L_j(A_j)$-terminal chain, starting with $I$, the essential item found by $c_i$. Since $q_i' \subseteq q_i$, this chain is also a chain through the state of which $(B,Q)$ is a splitting; and so there must be an $A_j$-item in this chain. Let us see whether or not there has been an $A_j$-item among those recognized by $c_i$ before $c_m$.
If $c_i$ has not yet recognized an $A_j$-item, then $M$ has not found an $A_j$ yet either after predicting one at $h(c_i)$; so the prediction that has been made at $h(c_i)$ has not yet been fulfilled; thus the state that will be exposed when $Y$ is popped off at $h(c_m)$ will be the initial state $P_j$. Furthermore, since no $A_j$-item has yet been found, all items found up till now, including $A \to .Y(\omega_m \dashv^k /k)$, will be "below" the $A_j$-item in the chain, and so will be in $P_j$. Let us consider what happens when $Y$ is popped off, exposing $P_j$, and we try to apply $A$ to it. If $A \neq A_j$, then the next item to be found by $c_i$, $B \to .A\varphi(\tau)$, will also be in $P_j$; so since $P_j \subseteq q_i$, $f_M(P_j, A, \omega_m \dashv^k /k)$ will be well-defined, as will be $h(c_{m+1})$, with all the required properties.

If, however, $A = A_j$, it may be that we have just fulfilled the prediction made at $h(c_i)$; however, it is also possible that the $A_j$ found is a lower-level one, and does not fulfill the prediction. By definition, each $L_j(A_j)$-terminal chain can be broken into $H_1$ and $H_2$. So consider the chain corresponding to the items recognized by $c_i$. Either the item $A_j \to .Y(\omega_m \dashv^k /k)$ is the first item of $H_2$ or it is not. If it is not, then the next item in the chain, $B \to .A_j\varphi(\tau)$, is also in $P_j$; and so $f_M(P_j, A_j, \omega_m \dashv^k /k)$ is a well-defined state, and we are done. However, if it is the first item of $H_2$, then the next item in the chain, $B \to .A_j\varphi(\tau)$, is in the base state $q_i'$; and by definition, $f_M(P_j, A_j, \omega_m \dashv^k /k)$ will be equal to POP. So after exposing $P_j$, $h(c_m)$ will transfer to POP, wiping off the top stack level, and exposing $q_i'$ as the state to which it tries to apply $A_j$. Since $B \to .A_j\varphi(\tau)$ is the next item to be found by $c_i$, and since it is also in $q_i'$, $h(c_{m+1})$ will be well-defined and have the desired properties.

Finally, we must consider the case where $c_i$ found an $A_j$-item before $c_m$. We again consider whether or not that $A_j$-item was the first element of $H_2$, of the chain of items found by $c_i$. If not, then as we have already seen, the prediction made at $h(c_i)$ would not yet have been fulfilled; and so the state exposed by popping off $Y$ would still be $P_j$, and the analysis would be as above. If, however, the $A_j$-item already found was the first item of $H_2$, the prediction would have been fulfilled. Thus when $Y$ is popped off at $h(c_m)$, the base state $q_i'$ will be exposed. The item $A \to .Y(\omega_m \dashv^k /k)$ as well as the next item in the chain, $B \to .A\varphi(\tau)$, will both be in $H_2$ of the chain, and hence in $q_i'$. Thus after exposing $q_i'$, $M$ will be able to apply $A$ to it, since $f_M(q_i', A, \omega_m \dashv^k /k)$ is well-defined. Thus $h(c_{m+1})$ is well-defined, and it is easy to see that it satisfies the required properties.

This completes the final subcase of the analysis, and establishes the stated result. Q.E.D.

Theorem 3.67 $L(M_0) \subseteq L(M)$.

Proof Suppose $\omega \in L(M_0)$, and let $c_1, \ldots, c_n$ be its accepting sequence. Then by the foregoing result, there is a sequence of configurations of $M$: $h(c_1), \ldots, h(c_n)$. By definition $c_n$ is the configuration $(q, S\,q_0\,S\,q, \dashv^k)$. According to the preceding theorem, $\mathrm{LIN}(\alpha_n) = \mathrm{LIN}(\alpha_n')$. Since an initial state can not be a base state, it follows that every level of an MSP(k) stack $\alpha$ must add at least one symbol to $\mathrm{LIN}(\alpha)$. Therefore since $\mathrm{LIN}(\alpha_n) = S$, it must be that $\alpha_n'$ has only one level. This level must be of the form $S\,q_0'\,S\,q'$ where $q' = f_M(q_0', S, \dashv^k)$. Thus $q' = \mathrm{POP}$, and $h(c_n)$ is the configuration $(\mathrm{POP}, S\,q_0'\,S\,\mathrm{POP}, \dashv^k)$. In each case, we either had $h(c_i) \vdash h(c_{i+1})$ or $h(c_i) \vdash c \vdash h(c_{i+1})$; in any event, we have $h(c_1) \vdash^{*} h(c_n)$; since $h(c_1)$ is the initial configuration of $M$ for $\omega$, this means that $\omega \in L(M)$. Q.E.D.


We shall now show the converse of the preceding result, and prove that $L(M) \subseteq L(M_0)$, that any string recognized by an MSP(k) machine for $G$ is also recognized by the LR(k) machine for $G$. We again use a twin machine approach, but the basic lemma is much easier to obtain in this case. We do not need to make use of all the details of the definition of a state-splitting, but largely only one: that each component of a splitting is a subset of the state being split.

Definition 3.68 Let $M$ be an MSP(k) machine for $G$. Suppose $\omega \in L(M)$, and $d_1, d_2, \ldots, d_n$ is the accepting sequence of configurations of $M$ for $\omega$. Then we define a sequence of configurations of $M_0$, $h(d_1), h(d_2), \ldots, h(d_n)$, as follows: $h(d_1)$ is the initial configuration of $M_0$ for $\omega$; if $d_{i+1}$ is a read or reduce successor of $d_i$, then $h(d_{i+1})$ is the successor of $h(d_i)$; otherwise $h(d_{i+1}) = h(d_i)$.

We want the sequence of configurations of $M_0$ to keep track of what $M$ is doing. The performance of a prediction or a suspension by $M$ has no counterpart in activities by $M_0$, so in that case $M_0$ does not have to change its configuration in order to keep up with $M$.
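The bookkeeping in Definition 3.68 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `twin_indices` and the encoding of each move of $M$ as one of the strings `"read"`, `"reduce"`, `"predict"`, `"suspend"` are hypothetical and do not come from the text. For each $d_i$, the sketch records how many steps $M_0$ has taken on reaching $h(d_i)$.

```python
def twin_indices(moves):
    """Return, for each configuration d_1..d_n of M, the number of steps the
    LR(k) twin M_0 has taken when it reaches h(d_i).

    moves[i] names the kind of step leading from d_(i+1) to d_(i+2)
    (0-based list): "read", "reduce", "predict" or "suspend".
    """
    steps = [0]  # h(d_1) is the initial configuration of M_0
    for kind in moves:
        if kind in ("read", "reduce"):
            # M_0 has a matching move, so its configuration advances.
            steps.append(steps[-1] + 1)
        elif kind in ("predict", "suspend"):
            # No counterpart in M_0: h(d_(i+1)) = h(d_i).
            steps.append(steps[-1])
        else:
            raise ValueError(f"unknown move kind: {kind!r}")
    return steps
```

For instance, `twin_indices(["predict", "read", "reduce", "suspend", "read"])` leaves the twin in place across the prediction and the suspension and advances it on the other three moves.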

Lemma 3.69 If $\omega \in L(M)$, with accepting sequence $d_1, d_2, \ldots, d_n$, and with $h(d_1), \ldots, h(d_n)$ defined as above, then for each $i$, $1 \le i \le n$:

i) $h(d_i)$ is a well-defined configuration of $M_0$, $(q_i', \alpha_i', \omega_i')$
ii) $\omega_i = \omega_i'$
iii) $\mathrm{LIN}(\alpha_i) = \mathrm{LIN}(\alpha_i')$
iv) if $q_i$ is the final state for $A \to Y$, $q_i'$ is the same final state
v) if $q_i$ is not a final state, then neither is $q_i'$, and $q_i \subseteq q_i'$

Proof

Basis $i = 1$: immediate.

Induction Suppose the statement is true for $i$, $1 \le i \le m$; we shall show it is true for $m+1$.

Since $M$ is an MSP(k) machine for $G$, there are four possible relationships between $d_m$ and $d_{m+1}$: $d_{m+1}$ can be a read, a reduce, a prediction, or a suspension successor of $d_m$. We consider each in turn.

i) Suppose $d_{m+1}$ is a read successor of $d_m$. That is, $\omega_m = a\beta$, $q_{m+1} = f_M(q_m, a, \beta \dashv^k /k)$, and $\omega_{m+1} = \beta$. This can be the case only if there is some item $A \to \varphi.aY(\tau)$ in state $q_m$, with $\beta \dashv^k /k \in \mathrm{FIRST}_k(Y\tau)$. But if this item is in $q_m$, it is also in $q_m'$, since $q_m \subseteq q_m'$ by hypothesis. Since $q_m'$ is a state of a well-formed LR(k) machine, $f_{M_0}(q_m', a, \beta \dashv^k /k)$ is thus a single-valued, well-defined state. But at configuration $h(d_m)$, the next symbol of input is $a$ and the following lookahead is $\beta \dashv^k /k$. So $h(d_{m+1})$ is a well-defined read successor of $h(d_m)$. Since $\omega_m = a\omega_{m+1}$ and $\omega_m' = a\omega_{m+1}'$, we have $\omega_{m+1} = \omega_{m+1}'$. Also $\mathrm{LIN}(\alpha_{m+1})$ and $\mathrm{LIN}(\alpha_{m+1}')$ are obtained from their identical predecessors by appending an $a$, so they are identical. Finally, if $Y = \epsilon$, then both $q_{m+1}$ and $q_{m+1}'$ are the final state for $A \to \varphi a$; if $Y \neq \epsilon$, then the essential items of $q_{m+1}$ will be contained in the essential items of $q_{m+1}'$, since all $.a$ items of $q_m$ are in $q_m'$. Then since $q_{m+1}'$ is the completion of its essential items while $q_{m+1}$ (which is either an intermediate or a base) is contained in the completion of its essential items, we have $q_{m+1} \subseteq q_{m+1}'$.

ii) Suppose $d_{m+1}$ is a prediction successor of $d_m$. Then $q_m$ is a base state and $q_{m+1}$ is an associated initial state; $\omega_{m+1} = \omega_m$ and $\alpha_{m+1}$ is the same as $\alpha_m$ except for the addition of a new stack level. By definition, $h(d_{m+1}) = h(d_m)$. So $\mathrm{LIN}(\alpha_{m+1}') = \mathrm{LIN}(\alpha_m') = \mathrm{LIN}(\alpha_m) = \mathrm{LIN}(\alpha_{m+1})$, the last equality holding because adding a new level to a stack does not affect its linearization. Finally, since $q_{m+1}$ is a predictive state associated with the base state $q_m$, we know that $q_{m+1}$ is contained in the completion of the essential items of $q_m$; since $q_m \subseteq q_m'$, and $q_m'$ is an intermediate state and hence the completion of its essential items, we have $q_{m+1} \subseteq q_m'$. By definition $q_{m+1}' = q_m'$, and hence $q_{m+1} \subseteq q_{m+1}'$, and we are done.

iii) Suppose $d_{m+1}$ is a reduction successor of $d_m$. In this case $q_m$ is a final state, say for $A \to Y$. In going from $d_m$ to $d_{m+1}$ the machine $M$ will pop $Y$ off its topmost level, exposing some state, and will then apply $A$ to this state. Suppose that $\alpha_{m+1} = pXqAq_{m+1}$, where $q$ is the state exposed by popping off the $Y$. Let $\alpha_i$ be the last stack before $\alpha_m$ which was equal to $pXq$; by the operation of $M$, it is easy to see that there must have been such a stack. That is, $q = q_i$.


Now let us consider what happens to machine $M_0$ at configuration $h(d_m)$. It has entered the final state for $A \to Y$, and would like to pop $Y$ off the stack. Since $\mathrm{LIN}(\alpha_m) = \mathrm{LIN}(\alpha_m')$ and $\alpha_m$ had $Y$ on its topmost level to pop off, it must be that $Y$ can be popped off $\alpha_m'$ as well. After $Y$ is removed, some state $q'$ is exposed, and $M_0$ tries to apply $A$ to this state. We claim that $q'$ is precisely the state $q_i'$, the analogue of the state $M$ exposes when it pops off $Y$. This is shown as follows: since simply by popping off $Y$ at $d_m$, $M$ exposes the state $q_i$, $M$ can not have made any unfulfilled predictions between $d_i$ and $d_m$. By the inductive hypothesis, $M_0$ between $h(d_i)$ and $h(d_m)$ precisely mimics the read and reduce activities of $M$ between $d_i$ and $d_m$. Thus the topmost level of $\alpha_m$ records the reads and reductions that have been done to $q_i$, while the upper regions of $\alpha_m'$ do the same for $q_i'$. Thus if the net effect of going from $d_i$ to $d_m$ is to put $Y$ on top of $q_i$, the effect of going from $h(d_i)$ to $h(d_m)$ is to place $Y$ on top of $q_i'$.
When $q_i'$ is exposed by $M_0$ at $h(d_m)$, $M_0$ tries to apply $A$ to it. When $q_i$ is exposed to $M$ at $d_m$, $M$ successfully applies $A$ to $q_i$, with lookahead $\omega_m \dashv^k /k$. There are three possibilities for the nature of this success. If there is some item $B \to y.A\varphi(\tau)$ in $q_i$, with $\omega_m \dashv^k /k \in \mathrm{FIRST}_k(\varphi\tau)$ and $\varphi \neq \epsilon$, so that $f_M(q_i, A, \omega_m \dashv^k /k)$ is a non-final state, then since $q_i \subseteq q_i'$, this $.A$ item and all others in $q_i$ will be in $q_i'$ as well. Since $M_0$ is an LR(k) machine, $f_{M_0}(q_i', A, \omega_m' \dashv^k /k)$ will be a single well-defined non-final state of $M_0$, which will contain all the essential items of $q_{m+1}$ and therefore all of $q_{m+1}$. Since $\omega_m' = \omega_m$, $h(d_{m+1})$ is thus well-defined. Similarly, if there is an item $B \to y.A(\tau)$ in $q_i$, it will be in $q_i'$ as well, with $q_{m+1}$ and $q_{m+1}'$ both being the final state for $B \to yA$. In both these cases, $\omega_{m+1}' = \omega_m' = \omega_m = \omega_{m+1}$, since the input is not affected by the performance of a reduction; and also in both cases, the stacks are identically affected by the removal of $Y$ and its replacement by $A$.


There is another way for $M$ to be able to apply $A$ to $q_i$. That is if $q_i$ is an initial state for $A$, and $f_M(q_i, A, \omega_m \dashv^k /k) = \mathrm{POP}$. But if $q_i$ is an initial state, $q_{i-1}$ is its associated base state. By the construction of MSP(k) machines, if $f_M(q_i, A, \omega_m \dashv^k /k) = \mathrm{POP}$, there is some item $B \to y.A\varphi(\tau)$ in $q_{i-1}$, with $\omega_m \dashv^k /k \in \mathrm{FIRST}_k(\varphi\tau)$. Since $q_i$ was a prediction successor of $q_{i-1}$, it follows that $q_i' = q_{i-1}'$; and since by induction $q_{i-1} \subseteq q_{i-1}'$, we have $q_{i-1} \subseteq q_i'$. Thus the item $B \to y.A\varphi(\tau)$ is in $q_i'$, and so $f_{M_0}(q_i', A, \omega_m' \dashv^k /k)$ is defined, since $\omega_m = \omega_m'$. Thus $h(d_m)$ has a successor, with $q_{m+1}' = f_{M_0}(q_i', A, \omega_m' \dashv^k /k)$. Since $q_{m+1}$ is the POP state, we do not have to establish any relationship between $q_{m+1}$ and $q_{m+1}'$. Since $\omega_{m+1} = \omega_m$ and $\omega_{m+1}' = \omega_m'$, we have $\omega_{m+1} = \omega_{m+1}'$. Finally, in both machines, the new stack is obtained from the old one by replacing $Y$ by $A$, so $\mathrm{LIN}(\alpha_{m+1}) = \mathrm{LIN}(\alpha_{m+1}')$.


iv) Finally, let us suppose that $d_{m+1}$ is a suspension successor of $d_m$; i.e., $q_m = \mathrm{POP}$. The only way for $q_m$ to be the POP state is if $q_{m-1}$ is the final state for some rule $A \to Y$, where $A$ was the nonterminal predicted at the last unfulfilled prediction, say at configuration $d_i$. Then as we have seen, the discovery of the $A$ at $d_{m-1}$ involved popping off $Y$, exposing state $q_i$, and then applying $A$ to $q_i$ to reach the POP state, $q_m$. In going from $d_m$ to $d_{m+1}$ then, $M$ wipes off the top level of $\alpha_m$, exposing state $q_{i-1}$, the base state of the prediction that has just been fulfilled; then $q_{m+1} = f_M(q_{i-1}, A, \omega_m \dashv^k /k)$. By definition, if $q_m$ is POP, then $h(d_{m+1}) = h(d_m)$; thus as we have seen, $q_{m+1}' = q_m' = f_{M_0}(q_i', A, \omega_m' \dashv^k /k)$. But since $d_i$ was a prediction successor of $d_{i-1}$, we know that $h(d_i) = h(d_{i-1})$; hence $q_i' = q_{i-1}'$. Therefore, since by induction $q_{i-1} \subseteq q_{i-1}'$, if $f_M(q_{i-1}, A, \omega_m \dashv^k /k)$ is defined, so is $f_{M_0}(q_{i-1}', A, \omega_m' \dashv^k /k)$, and the former will be a subset of the latter, unless they are both the same final state. Thus $q_{m+1}$ and $q_{m+1}'$ bear the required relationship. Since wiping off the top level of the stack does not affect either the input or the linearization of the stack, we have $\mathrm{LIN}(\alpha_{m+1}) = \mathrm{LIN}(\alpha_m)$ and $\omega_{m+1} = \omega_m$; by definition, $\omega_{m+1}' = \omega_m'$ and $\alpha_{m+1}' = \alpha_m'$, while $\omega_m' = \omega_m$ and $\mathrm{LIN}(\alpha_m') = \mathrm{LIN}(\alpha_m)$; hence $\omega_{m+1}' = \omega_{m+1}$ and $\mathrm{LIN}(\alpha_{m+1}') = \mathrm{LIN}(\alpha_{m+1})$, and we are done with this last case.


This  completes  the  induction  and  the  proof. 


Q.E.D. 


Theorem 3.70 $L(M) \subseteq L(M_0)$

Proof If $\omega \in L(M)$, let $d_1, \ldots, d_n$ be its accepting sequence. Then $h(d_1), \ldots, h(d_n)$ is well-defined. In each case, either $h(d_i) \vdash h(d_{i+1})$ or $h(d_i) = h(d_{i+1})$; hence $h(d_1) \vdash^{*} h(d_n)$. Since $d_n$ is the last of the accepting sequence, $\alpha_n = S\,q_0\,S\,q$; since $\mathrm{LIN}(\alpha_n) = \mathrm{LIN}(\alpha_n')$, it follows that $\alpha_n' = S\,q_0'\,S\,q'$, where $q' = f_{M_0}(q_0', S, \dashv^k)$. So $h(d_n)$ is the end of an accepting sequence of $M_0$ for $\omega$, and we are done. Q.E.D.


Theorem 3.71 Let $M$ be any MSP(k) machine for the LR(k) grammar $G$, and $M_0$ the LR(k) machine for $G$. Then $L(M) = L(M_0) = L(G)$.

Proof By Theorem 3.67 and Theorem 3.70. Q.E.D.


Theorem 3.72 Let $G$ be an LR(k) grammar, $\omega \in L(G)$, and $M$ any MSP(k) machine for $G$. Then $M$, when presented with $\omega$, enters the same final states and in the same order as does $M_0$ when presented with $\omega$.

Proof Immediate from the proofs of Proposition 3.66 and Lemma 3.69. Q.E.D.

Corollary 3.73 If $M$ is an MSP(k) machine for $G$ and $\omega \in L(G)$, then $M$, when presented with $\omega$, produces the left-to-right, bottom-up parse of $\omega$.

Corollary 3.74 The number of steps taken by $M$ in parsing $\omega$ is equal to the number of steps taken by $M_0$ in parsing $\omega$ plus twice the number of predictions made during the parse by $M$.

This last result also comes from the proofs of 3.66 and 3.69. We saw there that $M$ precisely mimics $M_0$, except for the extra steps of making and suspending predictions. For each such prediction, both these actions must occur, which leads to this last result.
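The count in Corollary 3.74 can be checked mechanically on a recorded sequence of moves. The sketch below is illustrative only: `step_counts` and the string encoding of moves are hypothetical names, and the identity holds only for a complete parse, in which every prediction is eventually suspended.

```python
from collections import Counter


def step_counts(moves):
    """Split a recorded sequence of moves of M into the three quantities
    that Corollary 3.74 relates.

    Returns (steps taken by M, steps taken by M_0, predictions made by M).
    """
    counts = Counter(moves)
    # M_0 mirrors only the read and reduce moves of M.
    lr_steps = counts["read"] + counts["reduce"]
    return len(moves), lr_steps, counts["predict"]


# Hypothetical move list in which the single prediction is later suspended.
total, lr_steps, predictions = step_counts(
    ["predict", "read", "reduce", "suspend", "read", "reduce"]
)
assert total == lr_steps + 2 * predictions
```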

We have thus seen that our model of multiple-stack parsing machines does indeed capture the idea we started out with; we shall now see how this model may be usefully employed.
CHAPTER 4

THE BASIC TRANSFORMATION


4.1 Cycle-free MSP(k) Machines

In the preceding chapter, we defined a new kind of parsing machine, the multiple stack parsing machine, and showed that it did precisely the same things as the canonical LR(k) machine for the same grammar, except that the MSP(k) machine took a few more steps to do it. So far, this new model seems completely useless. But now we shall restrict our attention to a special kind of MSP(k) machine, and we shall see that our painstaking preparations will begin to pay off.

The most obvious way in which the parse performed by an MSP(k) machine differs from that done by an LR(k) machine is in the condition of the stack. The LR(k) machine uses a conventional, one-dimensional stack, while the MSP(k) machine must have available a stack-of-stacks: a two-dimensional stack, singly infinite in both dimensions. Why exactly must each stack level of the MSP(k) stack itself be a potentially infinite stack, rather than of bounded finite width? Because in general, upon creation of a new stack level, we can not be sure how many symbols will have to appear on that level before it is destroyed. But how do an unbounded number of symbols get to appear on a level? Only by the existence of cycles in the state-graph of the machine. Each time a transition is made in the machine, a new symbol is put onto the stack; a sequence of transitions puts on a sequence of symbols. And should some sequence of transitions return the machine to the state whence the sequence began, having placed on the stack some sequence of symbols, the whole process might begin anew, putting those symbols on again, and so on. In a cycle-free machine, however, only a bounded number of symbols could ever appear on a single stack level. The two-dimensional stack could then be considered as a one-dimensional stack again, with interesting implications.

Definition 4.1 If $M$ is an MSP(k) machine, then a path in $M$ is a sequence of states of $M$, $q_1 q_2 \cdots q_m$, such that for each $i$, there exists a symbol $a_i$ and a string $x_i$ such that $f(q_i, a_i, x_i) = q_{i+1}$. The length of this path is $m$, and it is labelled by $a_1 a_2 \cdots a_{m-1}$. A path $q_1 \cdots q_m$ is a straight path if $q_i = q_j$ iff $i = j$. A path is a cycle in $M$ if $q_1 = q_m$. $M$ is cycle-free if there are no cycles in $M$.

Proposition 4.2 If $M$ is a cycle-free MSP(k) machine, then during the parse of any string by $M$, there are a bounded number of symbols on any stack level of the stack. The bound is $2n$, where $n$ is the length of the longest straight path in $M$.

Proof Suppose that at some time there were some stack level with more than $2n$ symbols on it; we can assume that it is the topmost level at the time. Now any topmost level is always of the form $A_1 q_1 A_2 q_2 \cdots A_m q_m$, where each $q_i$ is a state and each $A_i$ is a vocabulary symbol. By the way $M$ operates, for each $i$ there is an $x_i$ such that $q_{i+1} = f(q_i, A_{i+1}, x_i)$. Since $M$ is cycle-free, $q_1 \cdots q_m$ must be a straight path. If there are more than $2n$ symbols on this level, it must be that $m > n$. But $n$ is the length of the longest straight path in $M$. This is a contradiction, and we are done. Q.E.D.
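Definition 4.1 and Proposition 4.2 suggest a direct computation: check that the state graph has no cycle, find the longest straight path, and double it. The following is a minimal sketch, assuming the machine is given only as a mapping from each state to its successor states (lookaheads and transition labels are dropped, since they do not affect path length); the function name and the input encoding are hypothetical.

```python
def longest_straight_path(edges):
    """Number of states on the longest path of a cycle-free state graph.

    edges maps each state to an iterable of its successor states.
    Raises ValueError if the graph contains a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited / on current DFS path / finished
    color = {}
    best = {}  # best[q] = states on the longest path starting at q

    def visit(q):
        color[q] = GREY
        longest = 1
        for r in edges.get(q, ()):
            state = color.get(r, WHITE)
            if state == GREY:
                raise ValueError("machine is not cycle-free")
            if state == WHITE:
                visit(r)
            longest = max(longest, 1 + best[r])
        color[q] = BLACK
        best[q] = longest

    states = set(edges) | {r for succ in edges.values() for r in succ}
    for q in states:
        if color.get(q, WHITE) == WHITE:
            visit(q)
    return max(best.values(), default=0)


def level_bound(edges):
    """Bound of Proposition 4.2 on the symbols of a single stack level."""
    return 2 * longest_straight_path(edges)
```

The recursion depth is at most the length of the longest path, so this is adequate for small machines; a large machine would call for an iterative traversal.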
In the case where $M$ is a cycle-free LR(k) machine, it is well-known that $M$ accepts a finite-state language. Since the stack never contains more than a bounded number of symbols, it is not really needed at all, and its use can be simulated by having the state keep track of what would be on the stack. For a general MSP(k) machine, however, this is not the case. Even if $M$ is cycle-free, that applies only to each individual stack level; the stack as a whole is of unbounded size, because indefinitely many new stack levels may be created by repeated predictions. $M$ can accept non-regular languages; the only constraint is that the stack can only be effectively used by means of predictions.

For a cycle-free machine, the maximum number of vocabulary symbols that can appear on any stack level is also equal to $n$, the length of the longest straight path in $M$. Since there are only finitely many vocabulary symbols, it follows that there are only finitely many different sequences of vocabulary symbols that can appear on a stack level. Let us ignore the state symbols that occur on a stack level, and just concentrate on the vocabulary symbols. We can think of each sequence of such symbols as being a single symbol from some other, finite set of symbols. We shall consider the sequence $A_1 A_2 A_3 \cdots A_m$ as representing the single symbol $(A_1, A_2 A_3 \cdots A_m)$. Note that $A_1$ is always the name of the predicted nonterminal for that level.
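As a small illustration of this encoding, the sketch below collapses one stack level into a single pair. It assumes, hypothetically, that a level is stored as a flat list alternating vocabulary symbols and states, beginning with the predicted nonterminal; neither this layout nor the function name is fixed by the text.

```python
def level_symbol(level):
    """Collapse one stack level into the single symbol (A_1, A_2 ... A_m).

    level is assumed to be [A_1, q_1, A_2, q_2, ..., A_m, q_m]; the state
    entries at odd positions are ignored.
    """
    vocab = level[0::2]  # keep only the vocabulary symbols
    if not vocab:
        raise ValueError("a stack level always names a predicted nonterminal")
    return vocab[0], tuple(vocab[1:])
```

A level holding `["S", 1, "A", 2, "x", 3]` would thus be read as the symbol `("S", ("A", "x"))`.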

It is thus possible for us to think of the two-dimensional MSP(k) stack as being, in the case of a cycle-free machine, a single-dimensional stack, on which are written symbols from some new vocabulary, symbols of the form $(X, \varphi)$. For each of these symbols, $X$ is a nonterminal of the original grammar $G$, while $\varphi$ is a sequence of terminals and nonterminals from that grammar. Intuitively, an LR(k) stack is read vertically; in going to an MSP(k) stack, we cut up the LR(k) stack into segments, and lay them out horizontally on top of one another. As we have seen, reading the MSP(k) stack in this way makes it look essentially the same as the original LR(k) stack. But now we are changing our perspective, reading the MSP(k) stack vertically, regarding each stack level as a single symbol.

Using this new perspective, let us observe what happens to the stack during the course of an MSP(k) parse. (From now on, our discussion is always referring to cycle-free MSP(k) machines.) If a terminal symbol is read from the input onto the stack, the topmost stack symbol (stack level) changes from $(X, \varphi)$ to $(X, \varphi a)$. If a prediction is made, where $Y$ is the name of the predicted nonterminal, the only change to the stack is that the symbol $(Y, \epsilon)$ is placed on top of the stack (i.e., a new level is created). If a reduction is made, some suffix of the topmost level is replaced by a single nonterminal; or, as we see things now, $(X, \varphi\alpha)$ becomes $(X, \varphi A)$. And finally, a suspension is made only if the topmost symbol is $(X, X)$; the effect of performing the suspension is to eliminate this symbol (discard the topmost stack level).

These four alternatives are depicted in Figure 4.1; in each case, the remaining input string is indicated below the stack. (Note that only the first alters the input string.) Now suppose we were watching a "stack movie" of such a parse, observing the changes made to the contents of the stack and the occasional symbol disappearing from the input stream, without knowing what kind of parse was being performed, or the identity of the underlying grammar. Then by a stretch of the imagination, we could believe that we were watching not an MSP parse, but a deterministic top-down (LL) parse. The identity of the grammar driving the parse could be induced from observation of steps in the parse. Thus, reading the symbol $a$ from the input and replacing $(X, \varphi)$ as the topmost stack symbol by $(X, \varphi a)$ looks exactly like the effect of applying the rule $(X, \varphi) \to a(X, \varphi a)$ during a top-down parse. Replacing $(X, \varphi\alpha)$ by $(X, \varphi A)$ could be caused by applying the rule $(X, \varphi\alpha) \to (X, \varphi A)$. Putting $(Y, \epsilon)$ on top of $(X, \varphi)$ might be effected by using the production $(X, \varphi) \to (Y, \epsilon)(X, \varphi Y)$. And discarding $(Y, Y)$ from the top of the stack can be viewed as nothing more than applying the rule $(Y, Y) \to \epsilon$. Thus, an MSP(k) parse, when viewed in the proper light, looks remarkably like an LL(k) parse, based on another grammar.
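The correspondence between the four kinds of move and the four kinds of production can be written down as a small table-like function. This is a sketch under assumptions: a stack symbol $(X, \varphi)$ is represented as a pair of a string and a tuple of symbols, and the name `rule_for_move` with its keyword parameters is hypothetical.

```python
EPSILON = ()  # the empty sequence of vocabulary symbols


def rule_for_move(kind, top, *, a=None, Y=None, alpha=None, A=None):
    """Return (left side, right side) of the production that a top-down
    parse would apply to imitate one MSP(k) move on the topmost symbol."""
    X, phi = top
    if kind == "read":        # (X, phi) -> a (X, phi a)
        return top, (a, (X, phi + (a,)))
    if kind == "predict":     # (X, phi) -> (Y, eps) (X, phi Y)
        return top, ((Y, EPSILON), (X, phi + (Y,)))
    if kind == "reduce":      # (X, phi alpha) -> (X, phi A)
        cut = len(phi) - len(alpha)
        if cut < 0 or phi[cut:] != alpha:
            raise ValueError("alpha is not a suffix of the topmost level")
        return top, ((X, phi[:cut] + (A,)),)
    if kind == "suspend":     # (X, X) -> eps
        if phi != (X,):
            raise ValueError("suspension needs the symbol (X, X) on top")
        return top, ()
    raise ValueError(f"unknown move kind: {kind!r}")
```

For example, `rule_for_move("predict", ("S", ("a",)), Y="B")` yields the production with left side `("S", ("a",))` and right side `(("B", ()), ("S", ("a", "B")))`.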

We don't have to sit around watching a lot of parses in order to determine what this other grammar is; we can read it directly off the machine, as seen below. This new grammar we shall call $T_M(G)$, because it depends on the MSP(k) machine chosen.

4.2 The Transformation $T_M(G)$

Algorithm 4.3 For a grammar $G$ and a cycle-free MSP(k) machine $M$ associated with $G$, the grammar $T_M(G)$ is defined as follows:

1) The terminals of $T_M(G)$ are the terminals of $G$.

2) To determine the nonterminals, first name every initial state of $M$ with a distinct name $(X_i, \epsilon)$, where $X$ is the nonterminal of $G$ associated with the state. Then assign names to all other states of $M$ as follows: if a state $q$ can be reached from the initial state named $(X_i, \epsilon)$ by a path labelled by $\varphi$, then $q$ is given the name $(X_i, \varphi)$. The set of all such state names is the set of nonterminals of $T_M(G)$. The sentence symbol of $T_M(G)$ is $(S_i, \epsilon)$, the name given the starting state of $M$.

3) The rules of $T_M(G)$ are derived as follows:

i) If there is a transition labelled with the terminal symbol $a$ coming from a state named $(X_i, \varphi)$, add the rule $(X_i, \varphi) \to a(X_i, \varphi a)$.

ii) If the initial state named $(Y_j, \epsilon)$ is attached to the base state named $(X_i, \varphi)$, add the rule $(X_i, \varphi) \to (Y_j, \epsilon)(X_i, \varphi Y)$.

iii) If $(X_i, \varphi\alpha)$ is the name of the final state of $M$ associated with the rule of $G$, $A \to \alpha$, add the rule $(X_i, \varphi\alpha) \to (X_i, \varphi A)$.

iv) For each name $(X_i, \epsilon)$, add the rule $(X_i, X) \to \epsilon$.
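The steps of Algorithm 4.3 can be sketched as follows. The machine encoding is an assumption made for illustration only: `initial` maps each initial state to its distinct label and its nonterminal of $G$, `trans` maps (state, vocabulary symbol) to the next state with lookaheads and POP transitions omitted, `attached` lists (base state, initial state) pairs, and `finals` maps each final state to its rule of $G$. None of these names come from the text.

```python
from collections import defaultdict


def transform(initial, trans, attached, finals, terminals):
    """Build the state names and the rule set of T_M(G) for a cycle-free
    machine M. A nonterminal of T_M(G) is a pair (label, phi)."""
    out = defaultdict(list)
    for (q, sym), r in trans.items():
        out[q].append((sym, r))

    # Step 2: name every state by every path reaching it from an initial
    # state. The walk terminates because M is assumed cycle-free.
    names = defaultdict(set)

    def walk(q, label, phi):
        if (label, phi) in names[q]:
            return
        names[q].add((label, phi))
        for sym, r in out[q]:
            walk(r, label, phi + (sym,))

    for q, (label, _) in initial.items():
        walk(q, label, ())

    rules = set()
    # Step 3 i): transitions on terminals.
    for (q, sym), _ in trans.items():
        if sym in terminals:
            for label, phi in names[q]:
                rules.add(((label, phi), (sym, (label, phi + (sym,)))))
    # Step 3 ii): initial states attached to base states.
    for base, init in attached:
        y_label, y = initial[init]
        for label, phi in names[base]:
            rules.add(((label, phi), ((y_label, ()), (label, phi + (y,)))))
    # Step 3 iii): final states for rules A -> alpha of G.
    for q, (lhs, rhs) in finals.items():
        for label, phi in names[q]:
            cut = len(phi) - len(rhs)
            rules.add(((label, phi), ((label, phi[:cut] + (lhs,)),)))
    # Step 3 iv): erase a fulfilled prediction.
    for label, x in initial.values():
        rules.add(((label, (x,)), ()))
    return dict(names), rules


# Hypothetical machine for the grammar S -> A x, A -> a.
names, rules = transform(
    initial={1: ("S", "S"), 5: ("A", "A")},
    trans={(1, "A"): 2, (2, "x"): 3, (5, "a"): 6},
    attached=[(1, 5)],
    finals={3: ("S", ("A", "x")), 6: ("A", ("a",))},
    terminals={"a", "x"},
)
```

For this small machine the seven resulting rules derive the string ax from the sentence symbol `("S", ())`, expanding the leftmost nonterminal at each step.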

Before we give an example of this algorithm, a few comments are in order. First of all, the naming procedure is not unique, neither for states nor for names; two states may have the same name, and one state may have several names. Only initial states are uniquely named. Secondly, since $M$ is cycle-free, no state can have more than a finite number of names; thus the set of state names (and the set of nonterminals of $T_M(G)$) is finite. Finally, it is easy to see that all paths into the final state associated with $A \to \alpha$ must end with $\alpha$; so all names of such a state will be of the form $(X, \varphi\alpha)$, each giving rise to a rule $(X, \varphi\alpha) \to (X, \varphi A)$.

We shall establish certain conventions for subsequent use. First of all, in our examples, if there is but one initial state associated with the nonterminal $X$, we shall name it just $(X, \epsilon)$, omitting the subscript. Similarly, we shall often refer to a typical nonterminal of a grammar as $(X, \varphi)$, dropping the subscript and identifying $(X, \epsilon)$ as the name of an initial state associated with nonterminal $X$. However, we shall have occasion to refer to $(X_i, \varphi_i)$ as one nonterminal in a sequence of them; in these cases, we shall specifically give a name to the nonterminal of $G$ associated with the initial state $(X_i, \epsilon)$ (usually $A_i$); for in this context, we shall not mean that $(X_i, \varphi_i)$ and $(X_j, \varphi_j)$ are successors of the same initial state associated with nonterminal $X$.

For a simple example, consider the MSP(1) machine shown in Figure 4.2. It is associated with the grammar S → Ax, A → aB, A → ac, A → ad, B → Ax, B → Axy, B → b; by inspection, it is cycle-free. (State numbers are given for reference purposes only.) To derive the grammar T_M(G), we first apply our state-naming procedure. First, we assign name (S, ε) to state 1 and (B, ε) to state 8; this takes care of the initial states. Then state 2 is named (S, A), since there is a path labelled by A from state 1 to state 2. Similarly, state 3 is given the name (S, Ax). State 4 has two names, (S, a) and (B, a), because there are paths labelled by a from both state 1 and state 8 to state 4. Furthermore, both state 11 and state 12 have the name (B, Ax), because there is a path labelled Ax from state 8 to state 11 as well as one from state 8 to state 12. The different lookaheads associated with these two paths do not affect the naming of the states. Completing the naming process, we get the information summarized in Figure 4.3. The set of names in the second column is the nonterminal set of T_M(G), and (S, ε) is the sentence symbol.
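The naming procedure can be sketched in a few lines of Python. The transition table below is only our reconstruction of the machine of Figure 4.2 from the facts stated in the text: the state numbers 5, 6, 7, 9 and 13, the role of state 10, and the exact edge set are assumptions, as are the identifiers TRANS, INITIAL and name_states. The empty string stands for ε.

```python
from collections import defaultdict

# Labelled transitions: state -> symbol -> successor states.
# Reconstructed from the text; states 5, 6, 7, 9, 13 are hypothetical numbers.
TRANS = {
    1: {"A": [2], "a": [4]},
    2: {"x": [3]},
    4: {"B": [7], "c": [5], "d": [6]},
    8: {"A": [10], "a": [4], "b": [9]},
    10: {"x": [11, 12]},   # two successors, distinguished only by lookahead
    11: {"y": [13]},
}
INITIAL = {1: "S", 8: "B"}  # initial state -> associated nonterminal


def name_states(trans, initial):
    """Give state q the name (X, phi) whenever a path spelling phi leads
    from the initial state named (X, epsilon) to q.
    Terminates because the machine is assumed to be cycle-free."""
    names = defaultdict(set)
    for start, nonterminal in initial.items():
        stack = [(start, "")]
        while stack:
            state, path = stack.pop()
            names[state].add((nonterminal, path))
            for symbol, successors in trans.get(state, {}).items():
                for nxt in successors:
                    stack.append((nxt, path + symbol))
    return dict(names)


names = name_states(TRANS, INITIAL)
for state in sorted(names):
    print(state, sorted(names[state]))
```

Under these assumptions, state 4 receives two names and states 11 and 12 share one, as described above.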

To apply the first step of rule generation, we must locate those states which have some transition on a terminal symbol coming out of them. In this case, this includes states 1, 2, 4, 8, 10, and 11. For each of these states, we consider each of its names (X, φ) and each terminal symbol a coming out of it, and generate the rule (X, φ) → a(X, φa). Thus state 8 is named (B, ε) and has transitions on a and b leaving it; so we create rules (B, ε) → a(B, a) and (B, ε) → b(B, b). Similarly, state 4 has names (S, a) and (B, a), with outgoing transitions on c and d; this gives rise to the rules (S, a) → c(S, ac), (S, a) → d(S, ad), (B, a) → c(B, ac), and (B, a) → d(B, ad). The set of rules generated by this step of the algorithm is given in Figure 4.4.


    (S, ε) → a(S, a)          (B, ε) → a(B, a)
    (S, A) → x(S, Ax)         (B, ε) → b(B, b)
    (S, a) → c(S, ac)         (B, A) → x(B, Ax)
    (S, a) → d(S, ad)         (B, Ax) → y(B, Axy)
    (B, a) → c(B, ac)
    (B, a) → d(B, ad)

Figure 4.4
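The first step of rule generation is mechanical enough to sketch in Python. The names of states 1, 2, 4, 8, 10 and 11 are taken from the text and from Figure 4.4; the table of outgoing terminal symbols is stated explicitly only for states 4 and 8 and is inferred from the figure for the others. The identifiers NAMES, TERMINAL_MOVES and terminal_rules are ours, and the empty string stands for ε.

```python
# Names of the states that have outgoing terminal transitions.
NAMES = {
    1: [("S", "")],
    2: [("S", "A")],
    4: [("S", "a"), ("B", "a")],
    8: [("B", "")],
    10: [("B", "A")],
    11: [("B", "Ax")],
}
# Terminal symbols leaving each state (inferred from Figure 4.4
# except for states 4 and 8, which the text gives directly).
TERMINAL_MOVES = {1: "a", 2: "x", 4: "cd", 8: "ab", 10: "x", 11: "y"}


def terminal_rules(names, moves):
    """For each name (X, phi) of a state and each terminal a leaving it,
    emit the rule (X, phi) -> a (X, phi a)."""
    rules = []
    for state, terminals in moves.items():
        for (x, phi) in names[state]:
            for a in terminals:
                rules.append(((x, phi), (a, (x, phi + a))))
    return rules


def show(nonterminal):
    x, phi = nonterminal
    return "(%s, %s)" % (x, phi or "ε")


rules = terminal_rules(NAMES, TERMINAL_MOVES)
for lhs, (a, successor) in rules:
    print(show(lhs), "->", a + show(successor))
```

Because state 4 carries two names, each of its two terminal transitions contributes two rules.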


Next we locate each base state and examine each initial state attached to it. If (X, φ) is a name of the base state, and (Y, ε) the name of the attached initial state, then we create the production (X, φ) → (Y, ε)(X, φY). In this example, there is only one base state, with one attached initial state; since the base state has two names, we get the following two rules: (S, a) → (B, ε)(S, aB) and (B, a) → (B, ε)(B, aB).

Then we consider each final state of M. We have seen that if a final state is associated with production A → α, then each name of that state will be of the form (X, φα). Then for each such name, we create the rule (X, φα) → (X, φA). In our example, the rules generated are:

    (S, Ax) → (S, S)
    (S, aB) → (S, A)          (B, aB) → (B, A)
    (S, ac) → (S, A)          (B, ac) → (B, A)
    (S, ad) → (S, A)          (B, ad) → (B, A)
    (B, Ax) → (B, B)
    (B, Axy) → (B, B)

Note that since (B, Ax) is the name of both a final state and of a non-final state with a terminal transition, there are two different kinds of rules generated with (B, Ax) on the left-hand side: (B, Ax) → y(B, Axy) and (B, Ax) → (B, B).

Finally, we add one rule for each initial state. In this case, the rules are (S, S) → ε and (B, B) → ε. The entire grammar is given in Figure 4.5.

    (S, ε) → a(S, a)            (B, ε) → a(B, a)            (B, aB) → (B, A)
    (S, a) → (B, ε)(S, aB)      (B, ε) → b(B, b)            (B, A) → x(B, Ax)
    (S, a) → d(S, ad)           (B, a) → c(B, ac)           (B, Ax) → (B, B)
    (S, a) → c(S, ac)           (B, a) → d(B, ad)           (B, Ax) → y(B, Axy)
    (S, aB) → (S, A)            (B, a) → (B, ε)(B, aB)      (B, Axy) → (B, B)
    (S, ad) → (S, A)            (B, b) → (B, B)             (B, B) → ε
    (S, ac) → (S, A)            (B, ac) → (B, A)
    (S, A) → x(S, Ax)           (B, ad) → (B, A)
    (S, Ax) → (S, S)
    (S, S) → ε

Figure 4.5
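The four rule-generation steps can be put together in a short Python sketch. As before, the machine tables are our reconstruction of Figure 4.2 (state numbers 5, 6, 7, 9, 13 and the role of state 10 are assumptions), and every identifier below is hypothetical. The empty string stands for ε.

```python
from collections import defaultdict

TRANS = {
    1: {"A": [2], "a": [4]},
    2: {"x": [3]},
    4: {"B": [7], "c": [5], "d": [6]},
    8: {"A": [10], "a": [4], "b": [9]},
    10: {"x": [11, 12]},
    11: {"y": [13]},
}
INITIAL = {1: "S", 8: "B"}                  # initial state -> nonterminal
BASE = {4: [8]}                             # base state -> attached initial states
FINAL = {3: ("S", "Ax"), 5: ("A", "ac"), 6: ("A", "ad"), 7: ("A", "aB"),
         9: ("B", "b"), 12: ("B", "Ax"), 13: ("B", "Axy")}  # state -> rule of G
TERMINALS = set("abcdxy")


def name_states(trans, initial):
    names = defaultdict(set)
    for start, nonterminal in initial.items():
        stack = [(start, "")]
        while stack:                        # finite: the machine is cycle-free
            state, path = stack.pop()
            names[state].add((nonterminal, path))
            for symbol, successors in trans.get(state, {}).items():
                stack.extend((nxt, path + symbol) for nxt in successors)
    return names


def build_rules(trans, initial, base, final, terminals):
    rules = set()
    for state, nameset in name_states(trans, initial).items():
        for (x, phi) in nameset:
            for symbol in trans.get(state, {}):
                if symbol in terminals:     # step 1: (X, phi) -> a (X, phi a)
                    rules.add(((x, phi), (symbol, (x, phi + symbol))))
            for init in base.get(state, []):
                y = initial[init]           # step 2: (X, phi) -> (Y, eps)(X, phi Y)
                rules.add(((x, phi), ((y, ""), (x, phi + y))))
            if state in final:              # step 3: (X, phi alpha) -> (X, phi A)
                lhs, rhs = final[state]
                assert phi.endswith(rhs)
                prefix = phi[:len(phi) - len(rhs)]
                rules.add(((x, phi), ((x, prefix + lhs),)))
    for nonterminal in initial.values():    # step 4: (X, X) -> eps
        rules.add(((nonterminal, nonterminal), ()))
    return rules


rules = build_rules(TRANS, INITIAL, BASE, FINAL, TERMINALS)
print(len(rules), "rules generated")
```

Rules are kept in a set so that a rule reached through several states or names is counted once; under the stated assumptions the result has the same number of rules as Figure 4.5.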
By the observations made earlier, it is clear that for any grammar G and associated cycle-free machine M, the set of nonterminals given by Algorithm 4.3 is finite, and that the productions generated are indeed well-formed; thus T_M(G) is indeed a well-formed grammar.


4.3 The Relationship Between G and T_M(G)

We shall discuss at length various properties of the grammar T_M(G), but first we want to establish the most important, basic result. The grammar T_M(G), read off the cycle-free machine M, generates precisely the set of strings which M accepts; and since M is an MSP(k) machine associated with G, this in turn equals the language generated by G. We shall prove this result by showing that a leftmost derivation of a string in T_M(G) essentially simulates the progress of the parse of that string by the machine M.


Lemma 4.4 Let M be a cycle-free MSP(k) machine associated with grammar G, and T_M(G) the grammar derived from M. Say w ∈ L(M), i.e., w is accepted by M, and consider any configuration of M in accepting w. Suppose there are n stack levels to this configuration, and that the remaining input is ω_2, where ω_1 is the portion of w already read. Let A_j be the first (leftmost) symbol on the jth stack level of this configuration, and let φ_j represent the string of remaining vocabulary symbols on that level. Then there is a leftmost derivation in T_M(G)

    (S_1, ε) ⇒*_L ω_1 (X_n, φ_n)(X_{n-1}, φ_{n-1})...(X_1, φ_1),

where (X_i, ε) is the name of the initial state of the ith level.


Proof We prove this by induction on the number of the configuration in the accepting sequence for w.

Basis The first configuration has a stack with only one level, on which the only inscribed vocabulary symbol is the sentence symbol S. Thus A_1 = S and φ_1 = ε. Furthermore, none of the input has yet been read, so ω_1 = ε. Now (S_1, ε) is the name of the starting state, which is the initial state of the first level. Therefore, since (S_1, ε) ⇒*_L (S_1, ε), we are done.

Inductive Step Suppose the lemma is true for the first m configurations in the accepting sequence for w; we want to show it true for the (m+1)st. Let us examine more closely the mth configuration, as shown below. We have indicated only the vocabulary symbols on each level, except that we have indicated q, the state which M is in at this configuration. Observe that there is a path in M labelled by φ_n from the initial state of the nth level to the state q; therefore, if (X_n, ε) is the name of the initial state of the nth level, (X_n, φ_n) is a name for the state q.

How can the (m+1)st configuration follow from the mth? There are only four ways: by M performing a reduction, reading a symbol, taking a prediction, or suspending a prediction. We shall consider each of these cases. (From now on, we indicate the remaining input under the stack.)




Suppose M reads the terminal symbol a from the input in going from the mth configuration to the (m+1)st. Then the first symbol of the remaining input at the mth stage must have been a; we can compare the two configurations as seen below. For the change to be effected, there must be a transition on the terminal symbol a from state q to state q'.

    mth configuration          (m+1)st configuration

    A_n φ_n q                  A_n φ_n q a q'
    ...                        ...
    A_1 φ_1                    A_1 φ_1

    remaining input aω         remaining input ω


But since a name for q is (X_n, φ_n), this means there is a rule in T_M(G): (X_n, φ_n) → a(X_n, φ_n a). By hypothesis, (S_1, ε) ⇒*_L ω_1 (X_n, φ_n)...(X_1, φ_1). Therefore, (S_1, ε) ⇒*_L ω_1 a(X_n, φ_n a)...(X_1, φ_1). But this is precisely what we have to show, since at this new stage, the input already read is ω_1 a, the topmost stack level is A_n φ_n a with the same initial state as before, and all lower levels are unchanged.


Now suppose M performs a reduction according to the rule of G, B → α, in going from the mth configuration to the next one. Then the two configurations are as pictured below. The right-hand side of the rule, α, must be a suffix of φ_n; so φ_n = γα. Thus the state q has a name (X_n, γα); since q is a final state corresponding to the rule B → α, this gives rise to a rule in T_M(G), (X_n, γα) → (X_n, γB). Therefore,

    (S_1, ε) ⇒*_L ω_1 (X_n, γα)(X_{n-1}, φ_{n-1})...(X_1, φ_1) ⇒_L ω_1 (X_n, γB)(X_{n-1}, φ_{n-1})...(X_1, φ_1),

which is just what we have to show.

If the change of configuration is caused by a prediction, say of A_{n+1}, we get the following two stacks:

    mth configuration          (m+1)st configuration

                               A_{n+1} q'
    A_n φ_n q                  A_n φ_n A_{n+1}
    ...                        ...
    A_1 φ_1                    A_1 φ_1

    remaining input ω_2        remaining input ω_2


In both cases, we show only the last state on the topmost level. Note that in the (m+1)st configuration, φ_{n+1} = ε, since only the predicted nonterminal is written on the new level when a prediction is made. The initial state for this level is q', which is associated with nonterminal A_{n+1}; by hypothesis, we shall call the name of this state (X_{n+1}, ε). Then since (X_n, φ_n) is a name for q, and since q' is an initial state associated with the base state q, the production (X_n, φ_n) → (X_{n+1}, ε)(X_n, φ_n A_{n+1}) is a rule of T_M(G). Therefore,

    (S_1, ε) ⇒*_L ω_1 (X_{n+1}, ε)(X_n, φ_n A_{n+1})(X_{n-1}, φ_{n-1})...(X_1, φ_1),

which is what needs to be shown to prove the induction for this case.

Finally, we must consider the case where the change of configuration is caused by the suspension of a prediction. Then the "before" and "after" stacks are:

    mth configuration          (m+1)st configuration

    A_n A_n
    A_{n-1} φ_{n-1}            A_{n-1} φ_{n-1}
    ...                        ...
    A_1 φ_1                    A_1 φ_1

    remaining input ω_2        remaining input ω_2

By definition, (X_n, ε) is the name of the initial state of the nth level of the mth configuration, and A_n is the nonterminal associated with this state. Therefore, (X_n, A_n) → ε is a rule of T_M(G). Therefore,

    (S_1, ε) ⇒*_L ω_1 (X_n, φ_n)...(X_1, φ_1) ⇒_L ω_1 (X_{n-1}, φ_{n-1})...(X_1, φ_1),

which is the statement of the lemma for the (m+1)st configuration.

Thus the induction is complete, and the lemma is proved. Q.E.D.

Theorem 4.5 L(M) ⊆ L(T_M(G))

Proof Suppose w ∈ L(M). Then the final configuration of the accepting sequence of M for w has a one-level stack, containing just S q_0 S q, and an empty remaining input string; i.e., the input read so far is w, the entire input string. Then by the preceding lemma, (S_1, ε) ⇒*_L w(S_1, S), since (S_1, ε) is the name of the starting state q_0. But (S_1, S) → ε is a rule of T_M(G). Therefore (S_1, ε) ⇒*_L w, and since (S_1, ε) is the sentence symbol of T_M(G), this means w ∈ L(T_M(G)). Q.E.D.

Since M is an MSP(k) machine associated with the grammar G, we know that L(M) = L(G). Thus the theorem tells us that L(G) ⊆ L(T_M(G)). Now we want to show the other direction, that L(T_M(G)) ⊆ L(G). We shall resort to a rather different proof technique for this theorem. First we establish a lemma that defines the relationship between nonterminals of G and those of T_M(G).


Lemma 4.6 Let M be a cycle-free MSP(k) machine for G, and T_M(G) the derived grammar. Let (A, φ) be any nonterminal of T_M(G). Then for any string of terminals ω, if (A, φ) ⇒*_L ω via a derivation in T_M(G), then A ⇒*_R φω in G.


Proof The proof is by induction on the length of ω.

Basis Assume |ω| = 0; i.e., ω = ε. We have to show that if (A, φ) ⇒*_L ε then A ⇒*_R φ. We prove this in turn by induction on the length of the derivation of ε from (A, φ) in T_M(G).

Basis Assume that the length of the derivation of ε from (A, φ) is one; that is, (A, φ) → ε. But inspection of the possible rules of T_M(G) shows that there is a rule (A, φ) → ε if and only if φ = A. Since it is true that A ⇒*_R A, the basis is complete.

Inductive Step Assume for all nonterminals (A, φ) of T_M(G), that if (A, φ) ⇒*_L ε in less than k steps, then A ⇒*_R φ. We will show that if (A, φ) ⇒*_L ε in exactly k steps, then it is also the case that A ⇒*_R φ.

So assume that (A, φ) ⇒*_L ε in k steps. The first step in this derivation is an application of a rule either of the form (A, φ) → (B, ε)(A, φB) or one of the form (A, φ) → (A, φ'). (This must be since no rule (A, φ) → a(A, φa) can be used in the generation of the empty string, while (A, A) → ε is obviously the last rule applied; and there are only these four kinds of rules in T_M(G).) If the first rule applied is (A, φ) → (B, ε)(A, φB), then since we are dealing with context-free derivations, it must be that (B, ε) ⇒*_L ε in fewer than k steps and also (A, φB) ⇒*_L ε in fewer than k steps. So by the inductive assumption, A ⇒*_R φB and B ⇒*_R ε. Thus A ⇒*_R φB ⇒*_R φ, and the induction is complete.


If (A, φ) → (A, φ') is the first rule in the k-step derivation (A, φ) ⇒*_L ε, it must be that φ = γα and φ' = γD, for some γ, α, and D. Then (A, γD) ⇒*_L ε in k-1 steps, and so A ⇒*_R γD by inductive assumption. Since (A, γα) → (A, γD) is a rule of T_M(G), it must be that D → α is a rule of G. Then we have A ⇒*_R γD ⇒_R γα, and since γα = φ, we are done. Q.E.D.


Thus we have shown that if (A, φ) ⇒*_L ε then A ⇒*_R φ; thus the basis of our first induction is complete.


Inductive Step Assume for all nonterminals (A, φ) of T_M(G), that if (A, φ) ⇒*_L ω, where |ω| < k, then A ⇒*_R φω in G. Now we will show that if (A, φ) ⇒*_L ω, where |ω| = k, then A ⇒*_R φω. This proof will also be by induction on the length of the derivation of ω from (A, φ) in T_M(G).


Basis Just to be safe and make sure that the proof is well-constructed, we shall not take for our basis the case where the length of the derivation is one, because if k > 0, it can't happen that (A, φ) generates ω in one step. Let us consider, then, the shortest possible derivation of ω from (A, φ). Suppose ω = a_1 a_2...a_k. By construction of the rules of T_M(G), not more than one terminal symbol can be introduced into a sentential form by the application of a single rule. Therefore there must be at least k steps in the derivation, to introduce all the terminal symbols. Furthermore, the rightmost symbol in any sentential form of the derivation is of the form (A, γ); to get rid of this last nonterminal, the rule (A, A) → ε will have to be eventually applied. Furthermore, at least one rule will have to be applied to make the last nonterminal be (A, A). (Even if (A, φ) = (A, A), the first rule applied to (A, φ) will change its second component to something other than A, requiring another rule later to change it back.) Putting all this together, we see that the very shortest possible leftmost derivation of ω from (A, φ) is the following.


    (A, φ) ⇒ a_1(A, φa_1) ⇒ a_1 a_2(A, φa_1 a_2) ⇒ ... ⇒ a_1 a_2...a_k(A, φa_1 a_2...a_k)
           ⇒ a_1...a_k(A, A) ⇒ a_1...a_k = ω.

Any other derivation will have to be longer. But if this derivation is valid, there must be a final state in M for a rule of G, A → φa_1 a_2...a_k. Since a_1...a_k = ω, this means that A → φω is a rule of G, so A ⇒*_R φω. This completes the basis of our induction.


Inductive Step Assume that for all nonterminals (A, φ) and for all strings ω of length k, if there is a leftmost derivation of length less than m of ω from (A, φ), then A ⇒*_R φω. We want to show that if (A, φ) ⇒*_L ω in m steps, then too A ⇒*_R φω. Observe that we are doing a double induction, and so have two inductive hypotheses at our disposal: first, that if (A, φ) ⇒*_L ω, where |ω| < k, then A ⇒*_R φω; and secondly, that if (A, φ) ⇒*_L ω, where |ω| = k and the length of the derivation is less than m, then A ⇒*_R φω.

So suppose (A, φ) ⇒*_L ω in m steps, |ω| = k. Consider the first step of the derivation. If it is an application of the rule (A, φ) → a(A, φa), then (A, φa) ⇒*_L ω' in m-1 steps, where ω = aω'. Thus by hypothesis, A ⇒*_R φaω' = φω. Another possibility is that the first rule applied is (A, φ) → (B, ε)(A, φB). Then (B, ε) ⇒*_L ω_1 and (A, φB) ⇒*_L ω_2, where ω = ω_1 ω_2, both derivations being shorter than m steps. There are three cases: ω_1 = ε, ω_2 = ε, or neither ω_1 nor ω_2 equals ε.

1) If ω_1 = ε, then (B, ε) ⇒*_L ε and we have already established this implies B ⇒*_R ε; furthermore, (A, φB) ⇒*_L ω in fewer than m steps, so by hypothesis A ⇒*_R φBω. But since B ⇒*_R ε and ω is a string of terminals, this means A ⇒*_R φω.

2) If ω_2 = ε, then (B, ε) ⇒*_L ω in fewer than m steps, so B ⇒*_R ω. And since (A, φB) ⇒*_L ε, we know that A ⇒*_R φB. Thus A ⇒*_R φB ⇒*_R φω.

3) If ω_1 ≠ ε and ω_2 ≠ ε, then (B, ε) generates a string shorter than k. Then by the first inductive hypothesis, since (B, ε) ⇒*_L ω_1 and |ω_1| < k, we have B ⇒*_R ω_1. Similarly, since (A, φB) ⇒*_L ω_2 and |ω_2| < k, we have A ⇒*_R φBω_2. Therefore A ⇒*_R φBω_2 ⇒*_R φω_1 ω_2 = φω.

Thus all cases are satisfied, and so if (A, φ) ⇒*_L ω in m steps, |ω| = k, we have that A ⇒*_R φω. This completes our second induction, which in turn completes our first one: namely, we have established that if (A, φ) ⇒*_L ω, |ω| = k, then A ⇒*_R φω. Thus all inductive steps are complete, as is the proof.

Q.E.D.
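As a sanity check rather than a proof, Lemma 4.6 can be tested mechanically on the example of Figure 4.5 for short strings. In the sketch below, the textual rule format ("X,φ" for the nonterminal (X, φ), with "X," for (X, ε)), the bound N, and all function names are our own choices; the empty string stands for ε.

```python
G = {"S": [("A", "x")],
     "A": [("a", "B"), ("a", "c"), ("a", "d")],
     "B": [("A", "x"), ("A", "x", "y"), ("b",)]}

T_RULES = """
S, -> a S,a   ; S,a -> B, S,aB ; S,a -> c S,ac  ; S,a -> d S,ad
S,aB -> S,A   ; S,ac -> S,A    ; S,ad -> S,A    ; S,A -> x S,Ax
S,Ax -> S,S   ; S,S ->
B, -> a B,a   ; B, -> b B,b    ; B,a -> B, B,aB ; B,a -> c B,ac
B,a -> d B,ad ; B,b -> B,B     ; B,aB -> B,A    ; B,ac -> B,A
B,ad -> B,A   ; B,A -> x B,Ax  ; B,Ax -> B,B    ; B,Ax -> y B,Axy
B,Axy -> B,B  ; B,B ->
"""


def parse(text):
    def to_symbol(tok):
        return tuple(tok.split(",")) if "," in tok else tok
    grammar = {}
    for piece in text.replace("\n", ";").split(";"):
        if "->" in piece:
            lhs, rhs = piece.split("->")
            body = tuple(to_symbol(t) for t in rhs.split())
            grammar.setdefault(to_symbol(lhs.strip()), []).append(body)
    return grammar


def language(grammar, start, max_len):
    """Terminal strings of length <= max_len derivable from start."""
    found, seen, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        if form in seen or sum(s not in grammar for s in form) > max_len:
            continue
        seen.add(form)
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            found.add("".join(form))
        else:
            stack.extend(form[:idx] + rhs + form[idx + 1:] for rhs in grammar[form[idx]])
    return found


def right_sentential_forms(grammar, start, max_len):
    """Forms of length <= max_len reachable by rightmost derivation
    (sound as a bound here because no rule of G shortens a form)."""
    seen, stack = set(), [(start,)]
    while stack:
        form = stack.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        idx = next((i for i in reversed(range(len(form))) if form[i] in grammar), None)
        if idx is not None:
            stack.extend(form[:idx] + rhs + form[idx + 1:] for rhs in grammar[form[idx]])
    return {"".join(f) for f in seen}


T, N = parse(T_RULES), 4
forms = {x: right_sentential_forms(G, x, N + 3) for x in ("S", "B")}
checked, violations = 0, []
for (x, phi) in T:
    for w in language(T, (x, phi), N):
        checked += 1
        if phi + w not in forms[x]:   # the lemma asserts X =>* phi w in G
            violations.append(((x, phi), w))
print(checked, "pairs checked;", len(violations), "violations")
```

The bound N + 3 on form length reflects that the longest second component in this example, Axy, has three symbols.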


Theorem 4.7 L(T_M(G)) ⊆ L(G).

Proof. If ω ∈ L(T_M(G)), then (S_1, ε) ⇒*_L ω. Then by the lemma, S ⇒*_R ω, since S is associated with the starting state, which is named (S_1, ε). Q.E.D.

Theorem 4.8 L(T_M(G)) = L(G).

Proof. By Theorems 4.5 and 4.7, together with the fact that L(M) = L(G).

Q.E.D.
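Theorem 4.8 can be illustrated, though of course not proved, by enumerating both languages of the running example up to a fixed length and comparing them. The rule text below transcribes Figure 4.5; the textual format ("X,φ" for (X, φ), "X," for (X, ε)), the length bound of 7, and all identifiers are our own assumptions.

```python
G = {"S": [("A", "x")],
     "A": [("a", "B"), ("a", "c"), ("a", "d")],
     "B": [("A", "x"), ("A", "x", "y"), ("b",)]}

T_RULES = """
S, -> a S,a   ; S,a -> B, S,aB ; S,a -> c S,ac  ; S,a -> d S,ad
S,aB -> S,A   ; S,ac -> S,A    ; S,ad -> S,A    ; S,A -> x S,Ax
S,Ax -> S,S   ; S,S ->
B, -> a B,a   ; B, -> b B,b    ; B,a -> B, B,aB ; B,a -> c B,ac
B,a -> d B,ad ; B,b -> B,B     ; B,aB -> B,A    ; B,ac -> B,A
B,ad -> B,A   ; B,A -> x B,Ax  ; B,Ax -> B,B    ; B,Ax -> y B,Axy
B,Axy -> B,B  ; B,B ->
"""


def parse(text):
    """Read rules of the form 'X,phi -> symbols'; tokens containing a
    comma are nonterminals (X, phi), all other tokens are terminals."""
    def to_symbol(tok):
        return tuple(tok.split(",")) if "," in tok else tok
    grammar = {}
    for piece in text.replace("\n", ";").split(";"):
        if "->" in piece:
            lhs, rhs = piece.split("->")
            body = tuple(to_symbol(t) for t in rhs.split())
            grammar.setdefault(to_symbol(lhs.strip()), []).append(body)
    return grammar


def language(grammar, start, max_len):
    """Terminal strings of length <= max_len derivable from start,
    found by exhaustive leftmost expansion with pruning on the
    number of terminals already present in the sentential form."""
    found, seen, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        if form in seen or sum(s not in grammar for s in form) > max_len:
            continue
        seen.add(form)
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            found.add("".join(form))
        else:
            stack.extend(form[:idx] + rhs + form[idx + 1:] for rhs in grammar[form[idx]])
    return found


T = parse(T_RULES)
lang_g = language(G, "S", 7)
lang_t = language(T, ("S", ""), 7)
print(sorted(lang_g, key=lambda s: (len(s), s)))
print("languages agree up to length 7:", lang_g == lang_t)
```

The pruning terminates for both grammars because in each of them a sentential form can only grow after a new terminal has been introduced.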


4.4 Some Technical Lemmas

Before we proceed to discuss the full implications of the preceding theorems and the precise nature of our transformations, we shall establish a few more results that will be very useful.

A brief comment is appropriate here on the effect of ε-transitions in M on the generation of T_M(G). No rule will be generated of the form (X, φ) → ε(X, φε), if there is an ε-transition from an (X, φ) state, since ε is not a terminal symbol. (Such a rule would be a poor thing to have anyway, since it really is (X, φ) → (X, φ).) However, an ε-transition always goes to a final state for a rule of the form A → ε. Thus we will get a rule (X, φ) → (X, φA) in T_M(G).

Lemma 4.9 If M is a loop-free MSP(k) machine, then the state-naming procedure of Algorithm 4.3 will assign any given name (X, φ) to at most one non-final state.

Proof We have noted that it is possible for two or more different states of M all to be given the name (X, φ); this lemma maintains that at most one of these states will be non-final. By the naming procedure, a state q is given the name (X, φ) if there is a path to it from the initial state named (X, ε) so that the path spells out φ. If φ = φ_1 φ_2...φ_n, the transitions along the path must be on the φ_i, except possibly including some transitions on ε (since inclusion of ε does not affect the spelling of the path). But the only way ε-transitions can occur in the machine M is if some state has an item of the form A → .ε(u); then there will be an ε-transition from that state to the final state of the rule A → ε. Thus ε-transitions go only to final states; so if there is an ε-transition in a path spelling out φ, from initial state (X, ε) to a state named (X, φ), it can only be at the very end, in which case the (X, φ) state will be a final state.
Thus if non-final states q and q' are both named (X, φ), there must be paths of length exactly n from initial state (X, ε) to each of the states. Since q and q' are different states, the paths are not identical. Let the sequence of states along the path to q be q_0, q_1,...,q_n, where q_0 is the initial state (X, ε), q_n is the state q, and q_i = f(q_{i-1}, φ_i, τ_i) for some τ_i; similarly the path to q' goes through q_0', q_1',...,q_n'. Now q_0 = q_0', and q_n ≠ q_n'; thus there is some smallest i such that q_i ≠ q_i'. Then for some τ_i and τ_i', q_i = f(q_{i-1}, φ_i, τ_i) and q_i' = f(q_{i-1}, φ_i, τ_i'), since q_{i-1} = q_{i-1}'. But by construction and definition of MSP(k) machines, a state has two successors under the same symbol only if at least one of them is final. Thus either q_i or q_i' is final. So in order for the paths to q and q' to be well-defined, it must be that i = n, and either q or q' is final, which is what we wanted to prove. Q.E.D.
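The content of Lemma 4.9 can be observed on the running example. The sketch below again uses our reconstruction of the machine of Figure 4.2 (state numbers 5, 6, 7, 9, 13 and the set of final states are assumptions, as are all identifiers); it groups states by name and checks that no name is shared by two non-final states.

```python
from collections import defaultdict

TRANS = {
    1: {"A": [2], "a": [4]},
    2: {"x": [3]},
    4: {"B": [7], "c": [5], "d": [6]},
    8: {"A": [10], "a": [4], "b": [9]},
    10: {"x": [11, 12]},
    11: {"y": [13]},
}
INITIAL = {1: "S", 8: "B"}
FINAL_STATES = {3, 5, 6, 7, 9, 12, 13}   # assumed final states


def name_states(trans, initial):
    names = defaultdict(set)
    for start, nonterminal in initial.items():
        stack = [(start, "")]
        while stack:                      # finite: the machine has no cycles
            state, path = stack.pop()
            names[state].add((nonterminal, path))
            for symbol, successors in trans.get(state, {}).items():
                stack.extend((nxt, path + symbol) for nxt in successors)
    return names


states_named = defaultdict(set)           # name -> all states bearing it
non_final_named = defaultdict(set)        # name -> non-final states bearing it
for state, nameset in name_states(TRANS, INITIAL).items():
    for name in nameset:
        states_named[name].add(state)
        if state not in FINAL_STATES:
            non_final_named[name].add(state)

shared = {name: sorted(s) for name, s in states_named.items() if len(s) > 1}
print("names borne by several states:", shared)
print("lemma holds:", all(len(s) <= 1 for s in non_final_named.values()))
```

Here the only name borne by more than one state is (B, Ax), and exactly one of its two bearers is non-final, which is the situation the lemma permits.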

The preceding result assures us that we can refer to the non-final state named (X, φ). The next lemma is cumbersome to state, and rather technical. But it establishes a connection between the structure of the machine M and the derivations of strings generated by the grammar derived from M.

Lemma 4.10 Let M be a loop-free MSP(k) machine for the grammar G, and T_M(G) the grammar derived from M by Algorithm 4.3. Suppose that

    (S_1, ε) ⊣^k ⇒*_L ω_1 (X, φ) γ ⊣^k ⇒*_L ω_1 ω_2 ⊣^k

is a derivation in T_M(G), where ω_1, ω_2 ∈ V_T*. Then:

1) If φ = ε, then for some state q and lookahead ρ, g(q, ρ) = q', where q' is the initial state named (X, ε) and ρ = ω_2 ⊣^k / k.

2) If φ ≠ ε, let φ = τσ, where σ ∈ V_N ∪ V_T and τ ∈ (V_N ∪ V_T)*, and let q be the single non-final state named (X, τ). Consider the first rule applied in the leftmost derivation (X, φ) γ ⊣^k ⇒*_L ω_2 ⊣^k.

   a) If this rule is (X, τσ) → a(X, τσa) or (X, τσ) → (Y, ε)(X, τσY), then f(q, σ, ρ) = q', where q' is the single non-final state named (X, τσ) and ρ = ω_2 ⊣^k / k.

   b) If this rule is of the form (X, τσ) → (X, η), where τσ = αβ and η = αA, then f(q, σ, ρ) = q', where q' is the final state for the rule A → β and ρ = ω_2 ⊣^k / k.

   c) If this rule is (X, X) → ε (i.e., τ = ε and σ = X), then f(q, X, ρ) = POP, where ρ = ω_2 ⊣^k / k.

Proof This lemma tries to establish a relationship between the structure of the machine M and the nature of derivations in the grammar T_M(G) derived from it. We will prove it by induction, in the following way. By inspection, we can see that the only part of each case that requires much in the way of proof is the fact that ρ = ω_2 ⊣^k / k. That is, for each such case, it is apparent that the transition in question is defined as required for some lookahead ρ. For example, if (X, τσ) → a(X, τσa) is a rule of T_M(G), it could only have arisen from an a-transition out of a non-final state named (X, τσ); the fact that there is a non-final state named (X, τσ) means there is a transition on σ from a non-final state named (X, τ) to some other non-final state (which will thus be named (X, τσ)). There will be a number of lookaheads associated with this σ-transition; the lemma requires that one of these be equal to ω_2 ⊣^k / k. Similarly, for case b), if (X, τσ) → (X, η) is a rule, where τσ = αβ and η = αA, then the final state for rule A → β of G is named (X, τσ), and so there is a σ-transition to it from the non-final state named (X, τ). Similar analyses hold for the other cases. What remains for us to establish is that for each of the transitions in question, one of its associated lookaheads is equal to ω_2 ⊣^k / k. We will do this by showing that for each i ≤ k, there is a lookahead ρ associated with the transition such that ρ/i = ω_2 ⊣^k / i. For the case i = k, ρ/i = ρ, and our result will be established. Thus we do induction on i.

Basis i = 0. This case is trivially true, since for any ρ at all, ρ/0 = ε = ω_2 ⊣^k / 0.

Induction We assume that the lemma holds for lookahead ρ in each case with ρ/i = ω_2 ⊣^k / i. We want to show ρ/i+1 = ω_2 ⊣^k / i+1. (Of course, we're assuming i < k.) We will examine one case at a time. We will start with case 2a.

Case I Assume the first rule being applied in (X, φ) γ ⊣^k ⇒*_L ω_2 ⊣^k is (X, τσ) → a(X, τσa). Then the picture in Figure 4.6 obtains, and may help in seeing what's going on.

Figure 4.6

We know there is an a-transition from the non-final (X, τσ) state, and that there is a σ-transition from the non-final state named (X, τ) to that non-final (X, τσ) state, with lookahead ρ such that ρ/i = ω_2 ⊣^k / i.

By assumption, (X, τσ) γ ⊣^k ⇒_L a(X, τσa) γ ⊣^k ⇒*_L ω_2 ⊣^k. Thus a is the first symbol of ω_2; let ω_2 = aω_3. Then (X, τσa) γ ⊣^k ⇒*_L ω_3 ⊣^k. Now the first rule applied in this derivation will either be of the form (X, τσa) → b(X, τσab), (X, τσa) → (Y, ε)(X, τσaY), or (X, τσa) → (X, φ'); that is, it will not be (X, X) → ε. Then by induction, there will be an a-transition from the non-final (X, τσ) state to a (X, τσa) state (which is final or not depending on which kind of rule is being applied to (X, τσa)), with associated lookahead ρ' such that ρ'/i = ω_3 ⊣^k / i.

This means there is some item I in the non-final (X, τσ) state of the form A → γ_1.aγ_2(u), such that ω_3 ⊣^k / i ∈ FIRST_i(γ_2 u). Therefore aω_3 ⊣^k / i+1 ∈ FIRST_{i+1}(aγ_2 u). Now since σ is a non-null symbol, it is easy to see that every essential item of the non-final (X, τσ) state has a σ in the pre-dot position. Either the item I is essential or it is not; if it is not, it is the leftmost descendant of some essential item. Thus in either event there is some essential item B → ασ.β(π) such that a(ω_3 ⊣^k / i) ∈ FIRST_{i+1}(βπ). But a(ω_3 ⊣^k / i) = aω_3 ⊣^k / i+1 = ω_2 ⊣^k / i+1. And if B → ασ.β(π) is an item of the (X, τσ) state, B → α.σβ(π) is an item of the non-final (X, τ) state. Thus for some lookahead ρ'', f(q, σ, ρ'') = q', with ρ''/i+1 = ω_2 ⊣^k / i+1.
Case II Assume that the first rule in (X, φ) γ ⊣^k ⇒*_L ω_2 ⊣^k is (X, τσ) → (Y, ε)(X, τσY). Then we have the picture shown in Figure 4.7. We want to show that for some lookahead ρ as indicated, ρ/i+1 = ω_2 ⊣^k / i+1.


Figure  4.7 


We know that (Y, ε)(X, τσY) γ ⊣^k ⇒*_L ω_2 ⊣^k. If ω_2 = ε, then we have by induction that for some lookahead ρ in the desired position, ρ/i = ω_2 ⊣^k / i = ⊣^k / i = ⊣^i. By the construction of the machine M and the nature of ⊣ as a right-padding symbol, if the first i symbols of a lookahead ρ are all ⊣'s, then the entire string is ⊣^k, and so ρ/i+1 = ⊣^{i+1}. Thus we may assume that ω_2 ≠ ε.

The first rule in the derivation (Y, ε)(X, τσY) γ ⊣^k ⇒*_L ω_2 ⊣^k will be applied to (Y, ε); let us assume that this rule is (Y, ε) → a(Y, a). Then (Y, a)(X, τσY) γ ⊣^k ⇒*_L ω_3 ⊣^k, where ω_2 = aω_3. Then by induction, there is an a-transition from the non-final (Y, ε) state (which is the initial state for Y) to a (Y, a) state, with lookahead ρ such that ρ/i = ω_3 ⊣^k / i. Therefore there is an item in the (Y, ε) state of the form B → .aα(π), such that ω_3 ⊣^k / i ∈ FIRST_i(απ). Therefore ω_2 ⊣^k / i+1 ∈ FIRST_{i+1}(aαπ).
Let us recall that an initial state, when taken together with the base state with which it is associated, forms the completion of the essential items of that base state. Thus every item of the initial state is a descendant of some essential item in the base state. Thus there is some essential item I in the associated base state such that ω_2 ⊣^k / i+1 ∈ FIRST_{i+1}(I). But this associated base state is the (X, τσ) state. Thus as we argued before in Case I, there will be a σ-transition from the (X, τ) state with some associated lookahead ρ'' with ρ''/i+1 = ω_2 ⊣^k / i+1.

The preceding argument rested on the assumption that the rule applied to
$(Y, \epsilon)$ was $(Y, \epsilon) \to a(Y, a)$. The only other possibility would be an
application of a rule $(Y, \epsilon) \to (Y, A)$. This rule could be in the grammar $T_M(G)$ if
there were in M an $\epsilon$-transition from the initial state for Y to the final state
for the rule $A \to \epsilon$. But even in this case, there must be an eventual
application of a rule $(Z, \alpha) \to a(Z, \alpha a)$, since $\omega_2 \neq \epsilon$. Let us assume for the moment
that this rule is $(Y, \alpha) \to a(Y, \alpha a)$, and that the derivation $(Y, \epsilon) \Rightarrow^{*} (Y, \alpha)$
consisted solely of rules $(Y, \beta) \to (Y, \beta A)$, each of which arises from an
$\epsilon$-transition from the state named $(Y, \beta)$ to the final state for $A \to \epsilon$. It is
thus easy to see that $\alpha \Rightarrow^{*} \epsilon$. By induction, there is an $a/p$ transition from the
$(Y, \alpha)$ state to a $(Y, \alpha a)$ state such that $p/i = \omega_3\dashv^k/i$. Thus there is a
lookahead $p'$ on entrance to the $(Y, \alpha)$ state such that $p'/i+1 = \omega_2\dashv^k/i+1$.
Since $\alpha \Rightarrow^{*} \epsilon$, any lookahead on entry to the $(Y, \alpha)$ state is also a lookahead
on an $\epsilon$-transition out of the $(Y, \epsilon)$ initial state. Thus for some item I in
the base state associated with the $(Y, \epsilon)$ state (namely, the non-final $(X, \tau\sigma)$
state), $\omega_2\dashv^k/i+1 \in \mathrm{FIRST}_{i+1}(I)$. Therefore there is some lookahead $p''$ on
entrance to the $(X, \tau\sigma)$ state such that $p''/i+1 = \omega_2\dashv^k/i+1$, which is the
desired result. Figure 4.8 depicts the local structure of the machine M and
the various lookaheads.




Figure  4.8 


What other possibility exists for the first application of a rule that
introduces a terminal symbol in the derivation $(Y, \epsilon)(X, \tau\sigma Y)\Psi\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$?
There is always the chance that $(Y, \epsilon) \Rightarrow^{*} \epsilon$ (if, for example, $Y \to \epsilon$
is a rule of G), and that the introduction in question occurs later in the
derivation; or that for some $\alpha$, $(Y, \alpha) \to (Z, \epsilon)(Y, \alpha Z)$, and that the
terminal introduction is via $(Z, \beta) \to a(Z, \beta a)$. In any of these remaining
cases, a tedious extension of the preceding argument suffices to establish
the desired result. The motif of the proof is that corresponding to the
generation of the terminal symbol there will be an entry to some state q of
M with lookahead $p$ such that $p/i = \omega_3\dashv^k/i$, with entry to this state's
predecessor consequently having lookahead $p'$, $p'/i+1 = \omega_2\dashv^k/i+1$; and since
none of the intervening steps in the derivation introduce any terminal symbols,
the entrance to q's predecessor will have the same lookaheads as the entrance
to the $(X, \tau\sigma)$ state.

Case III Assume that the first rule in $(X, \varphi)\Psi\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$ is of the
form $(X, \varphi) \to (X, \varphi')$. That is, $(X, \tau\sigma) = (X, \varphi) = (X, \alpha\beta)$ and $(X, \varphi') =
(X, \alpha A)$. Then we have the picture in Figure 4.9, where $\sigma$ is the last symbol
of $\beta$.


Figure  4.9 

The analysis in this case resembles that of Case II. First of all, if
$\omega_2 = \epsilon$ the induction is trivially true, just as in the previous cases; so
we may assume $\omega_2 \neq \epsilon$. Therefore in the derivation $(X, \alpha A)\Psi\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$,
there is the application of at least one rule that introduces a terminal
symbol into the sentential form. Let us assume for the moment that the
first such rule is applied to $(X, \alpha A)$; i.e., $(X, \alpha A)\Psi\dashv^k \Rightarrow_L a(X, \alpha A a)
\Psi\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$. Then $(X, \alpha A a)\Psi\dashv^k \Rightarrow_L^{*} \omega_3\dashv^k$, where $\omega_2 = a\omega_3$, so by
induction the a-transition from $(X, \alpha A)$ to $(X, \alpha A a)$ is associated with
lookahead $p$, $p/i = \omega_3\dashv^k/i$. Therefore the A-entry to $(X, \alpha A)$ from $(X, \alpha)$ has
associated lookahead $p'$, $p'/i+1 = \omega_2\dashv^k/i+1$. But since $A \to \beta$ is a rule of G, the
lookaheads on entrance to the $(X, \alpha\beta)$ state must be the same as those on entry
to the $(X, \alpha A)$ state. Therefore, $p'$ is also a lookahead on entry to the
$(X, \alpha\beta)$ state, which is the $(X, \varphi)$ state; and so we are done.

If the first rule which introduces a terminal symbol is not applied to
$(X, \alpha A)$, but later in the derivation, an extension of the above argument
suffices. Once again, the key point is to show that the lookahead on entry
to the $(X, \alpha\beta)$ state is the same as the lookahead on entry to the state named
by the nonterminal to which the terminal-introducing rule is applied. The
argument is straightforward, but tedious, and follows the pattern of the
previous case.

Case IV Assume the first rule in $(X, \varphi)\Psi\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$ is $(X, X) \to \epsilon$.
That is, $(X, \varphi) = (X, X)$, so $\tau = \epsilon$ and $\sigma = X$. Here we have the picture of
Figure 4.10.


Figure 4.10

As before, the case is trivial if $\omega_2 = \epsilon$, so we assume $\omega_2 \neq \epsilon$. Let us say
that the first nonterminal symbol of $\Psi$ is $(Y, \alpha X)$; it must be of this form,
since $(X, \epsilon)$ must have been introduced into the sentential form by $(Y, \alpha) \to
(X, \epsilon)(Y, \alpha X)$. Then the base state associated with the initial $(X, \epsilon)$ state is
named $(Y, \alpha)$. Once again, let us consider the first rule in the derivation
that introduces a terminal symbol. Assume this rule is $(Y, \alpha X) \to a(Y, \alpha X a)$.
Then by induction, the a-transition from $(Y, \alpha X)$ to $(Y, \alpha X a)$ has an associated
lookahead $p$ such that $p/i = \omega_3\dashv^k/i$, where $\omega_2 = a\omega_3$. Therefore, the X-transition
from $(Y, \alpha)$ to $(Y, \alpha X)$ has associated lookahead $p'$, $p'/i+1 = \omega_2\dashv^k/i+1$. Now recall
from the definition of MSP(k) machines that if q is an initial state named
$(X, \epsilon)$ and $q'$ is the base state associated with q, then $f(q, X, p'') = \mathrm{POP}$ iff
$p'' \in \mathrm{FOLLOW}_k(q', X)$. In our case, since $p'$ is a lookahead on the X-transition
from $(Y, \alpha)$ to $(Y, \alpha X)$, it must be that $p'$ is in the k-follow set of X in the
$(Y, \alpha)$ state. Therefore, since the $(Y, \alpha)$ state is the base associated with the
initial $(X, \epsilon)$ state, $p'$ will also be a lookahead for the X-transition from that
initial state to POP. Thus there is a lookahead $p'$ on entry to the POP state
such that $p'/i+1 = \omega_2\dashv^k/i+1$, which was the desired result.

Now what if the first rule that introduces a terminal is not applied to
$(Y, \alpha X)$? If it is applied to some nonterminal $(Y, \beta)$ such that $(Y, \alpha X) \Rightarrow^{*}
(Y, \beta)$, it is apparent that the lookahead on entry to the $(Y, \beta)$ state is the
same as it is upon entry to $(Y, \alpha X)$, and the above argument suffices. If it
is applied to some other nonterminal $(Z, \beta)$ further along in $\Psi$, or introduced
into the derivation by $(Y, \alpha') \to (Z, \epsilon)(Y, \alpha' Z)$, the same kind of argument
outlined previously applies, the key idea being that lookaheads on entry to
$(Z, \beta)$ will be the same as those on entry to $(Y, \alpha X)$.

Case V This is the case where $\varphi = \epsilon$. We want to show that if $(X, \epsilon)\Psi\dashv^k
\Rightarrow_L^{*} \omega_2\dashv^k$, then there is a lookahead $p$ on entry to the $(X, \epsilon)$ initial state
such that $p/i+1 = \omega_2\dashv^k/i+1$. The analysis used in Case II is directly
applicable here. There we established that there was an item I in the initial
state (and hence in the associated base) such that $\omega_2\dashv^k/i+1 \in \mathrm{FIRST}_{i+1}(I)$.
But by definition and construction of an MSP(k) machine, any member of
$\mathrm{FIRST}_k(I)$, where I is in the initial state, is a lookahead for entry to the
initial state. Thus, since $i+1 \le k$, there is some lookahead $p$, on entry to
the initial state, with $p/i+1 = \omega_2\dashv^k/i+1$.

This  is  the  final  case,  and  we  have  established  the  induction  step  of 
the  proof.  Thus  the  proof  of  this  lemma  is  complete.  Q.E.D. 

4.5 The LL(k)-ness of $T_M(G)$

The preceding lemma is not so interesting in itself, but it enables us
to establish the following, which is the converse of Lemma 4.4.

Lemma 4.11 Let M be a cycle-free MSP(k) machine for G, $T_M(G)$ the derived
grammar. Suppose $\omega \in L(T_M(G))$, and that $(S_1, \epsilon)\dashv^k \Rightarrow_L^{*} \omega_1(X_n, \varphi_n)
(X_{n-1}, \varphi_{n-1})\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_1\omega_2\dashv^k = \omega\dashv^k$. Then upon applying M to
$\omega\dashv^k$, M is eventually in the following configuration:

[Configuration diagram not legible in the source.]

where $A_i$ is the nonterminal associated with the initial state named $(X_i, \epsilon)$.

Proof We prove this by induction on the length of the leftmost derivation of
$\omega_1(X_n, \varphi_n)\ldots(X_1, \varphi_1)$ from $(S_1, \epsilon)$.

Basis If the length of the derivation is zero, then we have yet to read any
input, so $\omega_2 = \omega$, and $\omega_1(X_n, \varphi_n)\ldots(X_1, \varphi_1) = (S_1, \epsilon)$. Thus the
statement is trivially true in this case, since the condition of the stack
before any input is read is precisely as required: one level, with S the
only vocabulary symbol on it.

Inductive Step Suppose the lemma is true for derivations of length m; we
shall show it true for derivations of length m+1. We shall proceed by
considering the different possibilities for the last rule applied in the
derivation.

Case I First assume that the last rule applied in the derivation is of the
form $(X, \varphi) \to a(X, \varphi a)$. Then we have $(S_1, \epsilon) \Rightarrow_L^{m} \omega_1(X_n, \varphi_n)\ldots(X_1, \varphi_1)
\Rightarrow_L \omega_1 a(X_n, \varphi_n a)\ldots(X_1, \varphi_1)$. We have to show that after M has read $\omega_1 a$, it is
in the following configuration:

[Configuration diagram not legible in the source.]

By inductive assumption, we know that after reading $\omega_1$, M is in:

[Configuration diagram not legible in the source.]

where q is the state of M.

Assume first that $\varphi_n \neq \epsilon$, so $\varphi_n = \tau\sigma$, $\sigma \in V_N \cup V_T$, $\tau \in (V_N \cup V_T)^{*}$. Since
$(X_n, \varphi_n) \to a(X_n, \varphi_n a)$ is a rule of $T_M(G)$, we know there is a non-final state of M
named $(X_n, \varphi_n)$ and that there is an a-transition from it. Furthermore, q, the state
of M after reading $\omega_1$, is named $(X_n, \varphi_n)$; however we do not know ab initio
that it is the non-final state so named. However, we can deduce this by using
the preceding lemma.

Let us examine the topmost stack level of M after reading $\omega_1$; since
$\varphi_n = \tau\sigma$, we may consider it as: $\tau\, q'\, \sigma\, q$. Here q and $q'$ are states of M.
The last step in the operation of M entailed an entrance to state q (whether
after a read, reduce, or suspension) from state $q'$, on symbol $\sigma$ and lookahead
$\omega_2\dashv^k/k$. By Lemma 4.10, since $(X_n, \tau\sigma)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$ starting
with $(X_n, \tau\sigma) \to a(X_n, \tau\sigma a)$, there is a transition from the non-final state
named $(X_n, \tau)$ (which in our case is $q'$) to the non-final state named $(X_n, \tau\sigma)$,
on symbol $\sigma$ with lookahead $\omega_2\dashv^k/k$. Since the machine M is deterministic,
this means that our state q is the non-final $(X_n, \tau\sigma)$ state.

If $\omega_2 = a\omega_3$, we have that $(X_n, \tau\sigma a)(X_{n-1}, \varphi_{n-1})\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*}
\omega_3\dashv^k$. Then by the lemma again, there is a transition from the non-final
$(X_n, \tau\sigma)$ state to some $(X_n, \tau\sigma a)$ state, on symbol a with lookahead $\omega_3\dashv^k/k$.
After reading $\omega_1$, we have ascertained that M is in the non-final state named
$(X_n, \tau\sigma)$, with the remaining input being $\omega_2\dashv^k$, or $a\omega_3\dashv^k$. Then, since M
is deterministic, and there is an a-transition from that state in M with lookahead
$\omega_3\dashv^k/k$, the action that M will take at this juncture will indeed be to follow
that transition, thus reading the symbol a onto the stack, and leaving M in the
desired configuration. (We had to go through this latter analysis to make sure
at this point M would not decide, based on the lookahead, to make a prediction.)

If $\varphi_n = \epsilon$, we have $(X_n, \epsilon)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L a(X_n, a)\ldots(X_1, \varphi_1)\dashv^k
\Rightarrow_L^{*} \omega_2\dashv^k$. Then $(X_n, a)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_3\dashv^k$, and so by the lemma,
we know that there is an a-transition from the non-final $(X_n, \epsilon)$ state to some
$(X_n, a)$ state, on symbol a with lookahead $\omega_3\dashv^k/k$. We know that after
reading $\omega_1$, M is in a state named $(X_n, \epsilon)$; but can we be sure that this is the
initial $(X_n, \epsilon)$ state? Perhaps M read an $\epsilon$ and went to some final state also
named $(X_n, \epsilon)$. We can discount this possibility as follows: Even if this
were the case, prior to this $\epsilon$-transition, M must have been in the initial
$(X_n, \epsilon)$ state, with the remaining input being $a\omega_3\dashv^k$. Then if an $\epsilon$-transition
were to be effected, it would be on lookahead $a\omega_3\dashv^k/k$. But we already know
that there is an a-transition from the initial $(X_n, \epsilon)$ state with lookahead
$\omega_3\dashv^k/k$; so since M is deterministic, there can be no such $\epsilon$-transition.
Thus we know that after reading $\omega_1$, M will be in the initial $(X_n, \epsilon)$ state,
with the remaining input $a\omega_3\dashv^k$. Then the a-transition from that state will
be followed, reading a onto the stack, and achieving the desired configuration.

Case II Assume that the last rule applied in the derivation is of the form
$(X, \varphi) \to (Y, \epsilon)(X, \varphi Y)$. Then we have $(S_1, \epsilon) \Rightarrow_L^{m} \omega_1(X_n, \varphi_n)\ldots(X_1, \varphi_1) \Rightarrow_L
\omega_1(Y, \epsilon)(X_n, \varphi_n Y)\ldots(X_1, \varphi_1) \Rightarrow_L^{*} \omega_1\omega_2$. We want to show that M enters the
following configuration:

[Configuration diagram not legible in the source.]

By inductive assumption, we know that one configuration achieved by M after
reading $\omega_1$ is the following:

[Configuration diagram not legible in the source.]

where q is the state of M.


First let us observe that $\varphi_n \neq \epsilon$. Since $(X_n, \varphi_n) \to (Y, \epsilon)(X_n, \varphi_n Y)$ is
a rule of $T_M(G)$, some $(X_n, \varphi_n)$ state is a base with the $(Y, \epsilon)$ initial state
an associated initial. But an $(X_n, \epsilon)$ state is either itself an initial state,
or a final state, neither of which can be a base.

The first thing we have to show is that q is the non-final state named
$(X_n, \varphi_n)$. This proof is similar to that of Case I. Let $\varphi_n = \tau\sigma$ and let
the known earlier top stack level be $\tau\, q'\, \sigma\, q$. The last previous action
of M caused it to follow the $\sigma$-transition from $q'$ to q, where the lookahead
was $\omega_2\dashv^k/k$. By Lemma 4.10, since $(X_n, \tau\sigma)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$
beginning with the rule $(X_n, \tau\sigma) \to (Y, \epsilon)(X_n, \tau\sigma Y)$, there is a transition
from $q'$, on symbol $\sigma$ with lookahead $\omega_2\dashv^k/k$, to the non-final state named
$(X_n, \tau\sigma)$. Thus q is this non-final $(X_n, \tau\sigma)$ state.

We have that $(Y, \epsilon)(X_n, \tau\sigma Y)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$. Then by the
first clause of Lemma 4.10, there is a predictive transition from some base
state to the $(Y, \epsilon)$ initial state on lookahead $\omega_2\dashv^k/k$. Since $(X_n, \varphi_n) \to
(Y, \epsilon)(X_n, \varphi_n Y)$ is a rule, the non-final $(X_n, \varphi_n)$ state must be the base
state for the $(Y, \epsilon)$ initial state.

Let us recapitulate. We know that after reading $\omega_1$, M will at some time
be in state q, the non-final $(X_n, \varphi_n)$ state, with the remaining input being
$\omega_2\dashv^k$. Furthermore, we know that there is a predictive transition from
that $(X_n, \varphi_n)$ state to the $(Y, \epsilon)$ initial state, on lookahead $\omega_2\dashv^k/k$.
Therefore we can be sure that in the configuration

[Configuration diagram not legible in the source.]

M's action will be to jump to the predictive state, causing a new stack level
to be created, and leaving M in the required configuration.


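Case III Here we assume that the last rule in the derivation is of the form
$(X, \varphi) \to (X, \varphi')$. That is, for some $\alpha$, $\beta$, and A, we have that $\varphi_n = \alpha\beta$ and
that $(S_1, \epsilon) \Rightarrow_L^{m} \omega_1(X_n, \alpha\beta)\ldots(X_1, \varphi_1) \Rightarrow_L \omega_1(X_n, \alpha A)\ldots(X_1, \varphi_1) \Rightarrow_L^{*} \omega_1\omega_2$.
We know by hypothesis that M will reach the configuration

[Configuration diagram not legible in the source.]

It remains to show that q is the final state corresponding to the rule $A \to \beta$
of G.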


If $\sigma$ represents the last symbol of $\beta$, we can set $\alpha\beta = \tau\sigma$. Then the top
level of M's stack is $A_n\, \tau\, q'\, \sigma\, q$. Now we have $(X_n, \tau\sigma)\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*}
\omega_2\dashv^k$, where the first rule applied is $(X_n, \alpha\beta) \to (X_n, \alpha A)$. Then by
Lemma 4.10, there is a transition on $\sigma$ from $q'$ to the final state for $A \to \beta$,
with lookahead $\omega_2\dashv^k/k$. Now the last step performed by M involved following
the $\sigma$-transition from $q'$ with lookahead $\omega_2\dashv^k/k$; therefore q is the final
state for $A \to \beta$, and M will perform the reduction, leaving the required
configuration. (Note that this works as well for the case that $\beta$, and hence $\sigma$,
is $\epsilon$.)


Case IV We assume that the last rule applied in the derivation is $(X, X) \to \epsilon$.
That is, $(S_1, \epsilon) \Rightarrow_L^{m} \omega_1(X_n, X_n)(X_{n-1}, \varphi_{n-1})\ldots(X_1, \varphi_1) \Rightarrow_L \omega_1(X_{n-1}, \varphi_{n-1})\ldots
(X_1, \varphi_1) \Rightarrow_L^{*} \omega_1\omega_2$. The known configuration of M is

[Configuration diagram not legible in the source.]

and we must show that q is the POP state.

This case is easy. We have that $(X_n, X_n)(X_{n-1}, \varphi_{n-1})\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L
(X_{n-1}, \varphi_{n-1})\ldots(X_1, \varphi_1)\dashv^k \Rightarrow_L^{*} \omega_2\dashv^k$. Then by Lemma 4.10, there is an
$X_n$-transition from the $(X_n, \epsilon)$ initial state to the POP state, on lookahead
$\omega_2\dashv^k/k$. Now the top level of the stack is $A_n\, q'\, X_n\, q$, where $q'$ is the
$(X_n, \epsilon)$ initial state. The entrance to q was made by following an $X_n$-transition
with lookahead $\omega_2\dashv^k/k$ from the $(X_n, \epsilon)$ initial state. Therefore q is the POP
state, and M will suspend the top level, leaving the desired configuration.

This is the final case for our induction argument, and so the proof is
done. Q.E.D.




Lemma 4.11 has a number of interesting consequences, but the most important
is the following.

Theorem 4.12 Let M be a cycle-free MSP(k) machine for G, $T_M(G)$ the derived
grammar. Then $T_M(G)$ is strong LL(k), if $k > 0$; if $k = 0$, $T_M(G)$ is LL(1).

Proof We use the definition of strong LL(k) given by Rosenkrantz and
Stearns [13]: G is strong LL(k) if and only if given a word $\omega \in V_T^k$ and a
nonterminal A, then there is at most one production p such that for some $\omega_1$,
$\omega_2$, and $\omega_3$ in $V_T^{*}$:

1) $S \Rightarrow^{*} \omega_1 A \omega_3$

2) $A \Rightarrow^{*} \omega_2$ beginning with an application of production p

3) $\omega_2\omega_3/k = \omega$.

We want to recast this definition in terms of leftmost derivations. It is
clear that $S \Rightarrow^{*} \omega_1 A \omega_3$ if and only if $S \Rightarrow_L^{*} \omega_1 A \Psi$, for some $\Psi$ such that
$\Psi \Rightarrow^{*} \omega_3$. Furthermore $A \Rightarrow^{*} \omega_2$ beginning with rule p if and only if
$A \Rightarrow_L^{*} \omega_2$ beginning with p. Thus we may give the following characterization:
G is strong LL(k) if and only if given $\omega \in V_T^k$, and a nonterminal A, then
there is at most one production p such that for some $\omega_1$ and $\omega_2 \in V_T^{*}$ and
$\Psi \in (V_N \cup V_T)^{*}$:

1) $S \Rightarrow_L^{*} \omega_1 A \Psi$

2) $A \Rightarrow_L^{*} \omega_2$, beginning with the application of production p

3) $\omega \in \mathrm{FIRST}_k(\omega_2\Psi)$.
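For k = 1 the characterization above reduces to the familiar test that, for each nonterminal, the one-symbol selection sets of its alternatives are pairwise disjoint. The following is a minimal sketch of that test, assuming a reduced grammar (every nonterminal reachable and productive) encoded as a dictionary from nonterminals to tuples of symbols. The function names, the encoding and the `$` end marker are assumptions made for this illustration; this is the standard FIRST/FOLLOW computation, not a procedure taken from the text.

```python
from itertools import combinations

EPS = ""   # marks "derives the empty string" inside FIRST sets
END = "$"  # hypothetical end-of-input marker


def first_of(seq, first, nonterminals):
    """FIRST_1 of a symbol sequence, given the current FIRST sets."""
    out = set()
    for sym in seq:
        if sym not in nonterminals:   # a terminal starts the string
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                      # every symbol of seq can vanish
    return out


def is_strong_ll1(grammar, start):
    """True if no nonterminal has two alternatives with overlapping selection sets."""
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add(END)
    changed = True
    while changed:                    # joint fixed point for FIRST and FOLLOW
        changed = False
        for a, alts in grammar.items():
            for rhs in alts:
                f = first_of(rhs, first, nts)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in nts:
                        tail = first_of(rhs[i + 1:], first, nts)
                        new = tail - {EPS}
                        if EPS in tail:
                            new |= follow[a]
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True
    for a, alts in grammar.items():
        selects = []
        for rhs in alts:
            f = first_of(rhs, first, nts)
            sel = f - {EPS}
            if EPS in f:              # alternative can vanish: FOLLOW decides
                sel |= follow[a]
            selects.append(sel)
        if any(s1 & s2 for s1, s2 in combinations(selects, 2)):
            return False
    return True
```

For instance, the grammar with rules S → aSb and S → ε passes the test, while S → a together with S → ab does not, since both alternatives begin with a.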
Let us assume then that $T_M(G)$ is not strong LL(k). Then for some $\omega_1$ and
$\omega_2$, and some nonterminal $(X, \varphi)$, we have $(S_1, \epsilon)\dashv^k \Rightarrow_L^{*} \omega_1(X, \varphi)\Psi\dashv^k$ and
$(X, \varphi) \Rightarrow_L^{*} \omega_2$ beginning with rule $p_1$ and $(X, \varphi) \Rightarrow_L^{*} \omega_2$ beginning with rule
$p_2$. Let $\omega_3$ be any string such that $\Psi \Rightarrow_L^{*} \omega_3$, and let us consider the result
of applying the string $\omega_1\omega_2\omega_3$ to machine M. By Lemma 4.11, after reading
$\omega_1$, the configuration of M will be:

[Configuration diagram not legible in the source.]

Since $(S_1, \epsilon) \Rightarrow^{*} \omega_1\omega_2\omega_3$, M will accept the string $\omega_1\omega_2\omega_3$. Therefore there
must be some next configuration of M; furthermore, since M is deterministic,
there is only one such possible next configuration. But what will it be?
Suppose production $p_1$ is $(X, \varphi) \to r_1$; then by Lemma 4.11, since $(X, \varphi) \Rightarrow_L^{*} \omega_2$
beginning with production $p_1$, the next configuration of M should be:

[Configuration diagram not legible in the source.]

or very similar to it (if $r_1$ is $a(X, \varphi a)$ then the top stack level will be
$(X, \varphi a)$ while the first symbol will be removed from $\omega_2\omega_3$). Similarly, if $p_2$
is $(X, \varphi) \to r_2$, since $(X, \varphi) \Rightarrow_L^{*} \omega_2$ beginning with $p_2$, the next configuration
should be:

[Configuration diagram not legible in the source.]

But since $p_1$ and $p_2$ are different productions with the same left-hand side, it
must be that $r_1 \neq r_2$. Thus there must be two different next configurations for
M, which is impossible, since M is deterministic. Thus $T_M(G)$ is strong LL(k).

We observe that this analysis works both for the case where $k > 0$, in which
case it shows that $T_M(G)$ is strong LL(k); and for $k = 0$, where it can be used to
show that $T_M(G)$ is strong LL(1). The difference for the case where $k = 0$ is that
the machine, even an MSP(0) one, always needs to examine the first symbol of
the lookahead when it reads it, in order to decide which state to go to. Hence
the derived grammar has a similar property and so is LL(1). Q.E.D.


4.6  Discussion  and  Explanation 

We  are  now  in  a  position  to  look  back  on  what  we  have  done  in  this  chapter 
and  to  try  to  get  some  perspective  on  it.  We  began  with  the  notion  of  a  cycle- 
free  MSP(k)  machine,  an  MSP(k)  machine  that  did  not  have  indefinitely  long  paths 
through  it  and  so  would  never  need  more  than  a  finitely  wide  stack  level  at  any 
point  during  a  parse.  We  then  proceeded  to  name  the  states  of  such  a  machine; 
intuitively, we gave a state the name $(X, \varphi)$ if it could be reached by a path
spelling $\varphi$ from an initial state corresponding to the prediction of an X. (In
this way, several states could have the same name while any one state might have
several names.) Since M was cycle-free, we could be assured that there would be
only finitely many such names assigned in total. Then using these names as a
set of nonterminals, we derived a grammar from the machine M, and denoted it
$T_M(G)$. The rules of this grammar were of four different types:

$(X, \varphi) \to a(X, \varphi a)$; $(X, \varphi) \to (Y, \epsilon)(X, \varphi Y)$; $(X, \varphi\alpha) \to (X, \varphi A)$; $(X, X) \to \epsilon$.

The first type of rule would be created if there was an a-transition in M from
a state named $(X, \varphi)$; the second if some state named $(X, \varphi)$ were the base for
the prediction of Y; the third if some $(X, \varphi\alpha)$ state was a final state for
the rule of G, $A \to \alpha$; and the fourth for every predicted nonterminal.
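The four rule types can be generated mechanically once the named states are known. The sketch below assumes a deliberately simplified, hypothetical encoding of the machine as four lists of facts about named states; the function `derive_rules` and its parameter names are introduced here for illustration only. The sample facts are those implied by the rules used in the worked derivation at the end of this section, not a transcription of the machine of Figure 4.2, and grammar symbols are assumed to be single characters.

```python
def derive_rules(read_moves, predictions, final_states, predicted):
    """Build rules of the derived grammar from facts about named states.

    read_moves:   (X, phi, a)        an a-transition leaves a state named (X, phi)
    predictions:  (X, phi, Y)        a state named (X, phi) is the base for predicting Y
    final_states: (X, phi, alpha, A) a state named (X, phi alpha) is final for A -> alpha
    predicted:    iterable of predicted nonterminals X
    A rule is a pair (lhs, rhs); nonterminals are (X, phi) tuples, "" is the empty string.
    """
    rules = []
    for x, phi, a in read_moves:                  # (X, phi) -> a (X, phi a)
        rules.append(((x, phi), [a, (x, phi + a)]))
    for x, phi, y in predictions:                 # (X, phi) -> (Y, eps) (X, phi Y)
        rules.append(((x, phi), [(y, ""), (x, phi + y)]))
    for x, phi, alpha, a in final_states:         # (X, phi alpha) -> (X, phi A)
        rules.append(((x, phi + alpha), [(x, phi + a)]))
    for x in predicted:                           # (X, X) -> eps
        rules.append(((x, x), []))
    return rules


# Hypothetical fragment, inferred from the worked example's derivation.
rules = derive_rules(
    read_moves=[("S", "", "a"), ("B", "", "a"), ("B", "a", "c"),
                ("B", "A", "x"), ("B", "Ax", "y"), ("S", "A", "x")],
    predictions=[("S", "a", "B"), ("B", "a", "B")],
    final_states=[("B", "", "ac", "A"), ("B", "", "Ax", "B"), ("B", "", "aB", "A"),
                  ("B", "", "Axy", "B"), ("S", "", "aB", "A"), ("S", "", "Ax", "S")],
    predicted=["S", "B"],
)
```

Keeping the four kinds of facts in separate lists mirrors the four rule types one for one; a fuller implementation would read these facts off the state table of M rather than take them as input.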

We  note  in  passing  here  that  it  is  only  for  convenience  that  we  have 

described the derivation of $T_M(G)$ as a two-step process, first the naming of

the  states  and  then  the  construction  of  the  rules.  It  should  be  apparent  that 

these  two  steps  could  be  combined  into  one,  that  the  states  could  be  named 

"on  the  fly"  as  the  rules  were  being  generated. 

We have established some relationships between derivations in the grammar
$T_M(G)$ and the processing of strings by the machine M. In Lemma 4.4 we showed
that if $\omega$ was a string accepted by the machine M, then there was a leftmost
derivation of $\omega$ in $T_M(G)$, such that the sequence of sentential forms of the
derivation represented the sequence of stack configurations M was passing
through while processing $\omega$. In other words, the leftmost derivation of $\omega$ in
$T_M(G)$ simulated the MSP(k) parse of $\omega$ by M. The converse of this was proved
in Lemma 4.11. There we showed that if $\omega$ was a string in $L(T_M(G))$, then if
we gave $\omega$ to M, the sequence of stack configurations M would go through would
precisely mimic the sequence of sentential forms in the leftmost derivation
of $\omega$ in $T_M(G)$. These two lemmas taken together show that there is a one-one
relationship between leftmost derivations in $T_M(G)$ and MSP(k) parses by M.
This suffices to show that $L(T_M(G)) = L(M)$. Since M is an MSP(k) machine for
G, we already know that $L(M) = L(G)$; thus we have that the transformed grammar
$T_M(G)$ generates precisely the same set of strings as the original grammar G.
Furthermore, since M operates deterministically, we were able to deduce in
Theorem 4.12 that the process of leftmost derivations in $T_M(G)$ was also
deterministic; in other words, that $T_M(G)$ was strong LL(k).

Actually, we showed that $L(T_M(G)) = L(M)$ in a slightly different manner,
by relating the meaning of nonterminals in $T_M(G)$ to those in G. We showed in
Lemma 4.6 that if $(X, \varphi) \Rightarrow_L^{*} \omega$ in $T_M(G)$, where $\omega$ is a string of terminals, then
$X \Rightarrow^{*} \varphi\omega$ in G. This result is actually interesting in its own right, because
it begins to give us some insight into the nature of $T_M(G)$. Because of this
lemma, we can begin to think of the nonterminal $(X, \varphi)$ in $T_M(G)$ as generating
"the rest of an X (in G) after $\varphi$". Thinking in these terms, the rules
of $T_M(G)$ make a lot of sense. For example, $(X, \varphi) \to a(X, \varphi a)$ means that one
possibility for the rest of an X after $\varphi$ is an a followed by the rest of an X
after $\varphi a$; which indeed makes sense, if we think of X generating (in G) $\varphi a \tau$
for some $\tau$. Then the rest of the X after the $\varphi$ is $a\tau$, while the rest of the
X after $\varphi a$ is just $\tau$.
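As a concrete instance of this reading, consider the rules that appear in the worked derivation at the end of this section. Assuming that list is read correctly, it contains the reductions $(B, ac) \to (B, A)$ and $(B, Ax) \to (B, B)$, so $A \to ac$ and $B \to Ax$ are rules of G, and the two derivations below can be compared side by side:

```latex
\begin{aligned}
B &\Rightarrow Ax \Rightarrow acx
  && \text{in } G,\\
(B,\epsilon) &\Rightarrow a\,(B,a) \Rightarrow ac\,(B,ac) \Rightarrow ac\,(B,A)
  \Rightarrow acx\,(B,Ax) \Rightarrow acx\,(B,B) \Rightarrow acx
  && \text{in } T_M(G).
\end{aligned}
```

Reading off the intermediate forms, $(B, a)$ generates $cx$, which is the rest of this B after $a$, and $(B, ac)$ generates $x$, which is the rest of this B after $ac$.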

Similarly, $(X, \varphi\tau) \to (X, \varphi A)$ just means that the rest of an X after
$\varphi\tau$ may be the same as the rest of an X after $\varphi A$. And this is indeed reasonable,
if $A \to \tau$ is a rule of G. Or that $(X, X) \to \epsilon$ just means that the rest of an X
after an X is just $\epsilon$.

We do not want to stretch this analogy too far, though it is useful for an
intuitive feeling for $T_M(G)$. First of all, Lemma 4.6 was not an "if and only
if"; it is not the case that $(X, \varphi)$ necessarily generates all of the strings
which might be called the rest of an X after $\varphi$, only that any string which
$(X, \varphi)$ does generate may be so described. In particular, $(X, \epsilon)$ does not
necessarily generate all strings which X generates in G; specifically, $(X, \epsilon)$
generates precisely those strings whose generations from X in G begin with one
of the X-rules underlying an item in the $(X, \epsilon)$ initial state. This follows
directly from the proof of Lemma 4.6. In particular, if X is left recursive
in G, some of the rules for X may be "left back" in the base state of the
splitting, and so $(X, \epsilon)$ would not generate any strings generated in G using
these rules. By analogy, when an MSP(k) machine enters a predictive state and
predicts that an X will be found, it is not any X that will be found and that
will satisfy the prediction; rather, an X whose generation begins or whose parse
ends with one of the items included in the predictive state. This notion of
$(X, \varphi)$ as generating the rest of an X after $\varphi$ extends to the machine M also;
if in a given configuration of M, the topmost stack level is $X \varphi$, then the
portion of the remaining input that must be read in order to cause this level
to be suspended is also a string $\omega$ such that $X \Rightarrow_R^{*} \varphi\omega$ in G. That is, a string
which is the rest of an X after $\varphi$ will complete the parse at that level.

There is one other important piece of information that can be gleaned from
Lemma 4.6 and from the other main lemmas. We know that if $(S, \epsilon) \Rightarrow_L^{*} \omega$ in
$T_M(G)$, then $S \Rightarrow^{*} \omega$ in G. But we know even more than this; we know how to
reconstruct the tree for $\omega$ in G from the tree for $\omega$ in $T_M(G)$. Let $\omega$ be a
string in L(G), and consequently in L(M) and $L(T_M(G))$. We know that applying M
to $\omega$ produces the LR(k) parse of $\omega$ according to G; that is, the order in which
reductions are performed by M is the same as that in which they would be
performed by the LR(k) machine for G doing a conventional bottom-up parse of $\omega$.
Thus these reductions can be pieced together appropriately to form the tree
for $\omega$ in G.

We have established that there is a relationship between the stack
configurations M goes through while parsing $\omega$ and the sequence of sentential forms
in the leftmost derivation of $\omega$ in $T_M(G)$. In particular, performance of a
reduction by the machine M causes the topmost level of the stack to change
from $X \alpha\beta$ to $X \alpha A$, without an input symbol being read. The only kind of
rule in $T_M(G)$ that effects a similar transformation on the sentential form is
the rule $(X, \alpha\beta) \to (X, \alpha A)$. Thus in order to reconstruct the LR(k) parse of $\omega$
in G from the leftmost derivation of $\omega$ in $T_M(G)$, we may perform the following
procedure. The leftmost derivation of $\omega$ in $T_M(G)$ is a series of rules of
$T_M(G)$, $r_1 r_2 \ldots r_n$. Consider the subsequence of this list which consists just of
those rules of the form $(X, \alpha\beta) \to (X, \alpha A)$. From this subsequence, form the
corresponding list of rules of G, $A \to \beta$. This list will give the order of
reductions performed by an LR(k) parse of $\omega$; i.e., the list is the bottom-up
parse, and can be used to construct the tree for $\omega$ in G. Furthermore, the
sequence of rules of $T_M(G)$ used in the leftmost derivation of $\omega$ is precisely
the information provided by the deterministic top-down parse of $\omega$, i.e., the
LL(k) parse. We may summarize all this information in the following statement.

Theorem 4.13 The LR(k) parse for ω in G can be reconstructed from the LL(k)
parse of ω in T_M(G).

Figure 4.11

Figure 4.12


Let us give an example of how this reconstruction is performed. Recall
the grammar G and its associated cycle-free MSP(1) machine shown in Figure 4.2,
with the derived grammar T_M(G) shown in Figure 4.5. Consider the string
aaacxxyx; the syntax tree for this string in G is shown in Figure 4.11, while
the one for T_M(G) is given in Figure 4.12.

The following, then, is the sequence of rules of T_M(G) which is the leftmost
derivation of ω:

(S, ε) → a(S, a); (S, a) → (B, ε)(S, aB);
(B, ε) → a(B, a); (B, a) → (B, ε)(B, aB); (B, ε) → a(B, a); (B, a) → c(B, ac);
(B, ac) → (B, A); (B, A) → x(B, Ax); (B, Ax) → (B, B); (B, B) → ε;
(B, aB) → (B, A); (B, A) → x(B, Ax); (B, Ax) → y(B, Axy); (B, Axy) → (B, B);
(B, B) → ε; (S, aB) → (S, A); (S, A) → x(S, Ax); (S, Ax) → (S, S); (S, S) → ε.

The subsequence of rules of the form (X, αβ) → (X, αA) is: (B, ac) → (B, A);
(B, Ax) → (B, B); (B, aB) → (B, A); (B, Axy) → (B, B); (S, aB) → (S, A);
(S, Ax) → (S, S). The sequence of rules of G that can be derived from this
sequence is: A → ac, B → Ax, A → aB, B → Axy, A → aB, S → Ax; and this is
precisely the left-to-right, bottom-up sequence of rules used in the tree for
ω in G.
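
The reconstruction just carried out by hand can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: every grammar
symbol is a single character, upper-case letters are nonterminals, a
nonterminal (X, φ) of T_M(G) is encoded as a pair of strings, and the function
names are hypothetical. The derivation is the one listed in the example; the
replay step checks the result by running the reductions backwards as a
rightmost derivation in G.

```python
# Leftmost derivation of "aaacxxyx" in T_M(G), copied from the example.
# Each step is (lhs, terminal, rhs_nonterminals); "" stands for epsilon.
DERIVATION = [
    (("S", ""), "a", [("S", "a")]),
    (("S", "a"), "", [("B", ""), ("S", "aB")]),
    (("B", ""), "a", [("B", "a")]),
    (("B", "a"), "", [("B", ""), ("B", "aB")]),
    (("B", ""), "a", [("B", "a")]),
    (("B", "a"), "c", [("B", "ac")]),
    (("B", "ac"), "", [("B", "A")]),
    (("B", "A"), "x", [("B", "Ax")]),
    (("B", "Ax"), "", [("B", "B")]),
    (("B", "B"), "", []),
    (("B", "aB"), "", [("B", "A")]),
    (("B", "A"), "x", [("B", "Ax")]),
    (("B", "Ax"), "y", [("B", "Axy")]),
    (("B", "Axy"), "", [("B", "B")]),
    (("B", "B"), "", []),
    (("S", "aB"), "", [("S", "A")]),
    (("S", "A"), "x", [("S", "Ax")]),
    (("S", "Ax"), "", [("S", "S")]),
    (("S", "S"), "", []),
]


def is_reduction_rule(lhs, terminal, rhs):
    """True for rules of the form (X, alpha beta) -> (X, alpha A)."""
    return terminal == "" and len(rhs) == 1 and rhs[0][0] == lhs[0]


def lr_parse_from_ll_parse(derivation):
    """Map each reduction rule of T_M(G) back to the rule A -> beta of G."""
    reductions = []
    for lhs, terminal, rhs in derivation:
        if not is_reduction_rule(lhs, terminal, rhs):
            continue
        alpha, a = rhs[0][1][:-1], rhs[0][1][-1]
        if not lhs[1].startswith(alpha):
            raise ValueError("not of the form (X, alpha beta) -> (X, alpha A)")
        reductions.append((a, lhs[1][len(alpha):]))
    return reductions


def replay(reductions, start="S"):
    """Undo the reductions in reverse order (a rightmost derivation in G)."""
    form = start
    for lhs, rhs in reversed(reductions):
        i = max(j for j, ch in enumerate(form) if ch.isupper())
        if form[i] != lhs:
            raise ValueError("reductions are not a bottom-up parse")
        form = form[:i] + rhs + form[i + 1:]
    return form
```

Applied to DERIVATION, lr_parse_from_ll_parse yields the six rules of G listed
above, and replay expands them back into the input string.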

We thus see that there is no essential loss of information in going from
G to T_M(G), in that the G tree can be recovered from the T_M(G) tree. But a
casual inspection of the two preceding figures indicates that there can be
great structural differences between these two trees. In order to try to
understand the transformational effect of T_M on trees, let us attempt another
view of the grammatical transformation.

Suppose M were a cycle-free LR(k) machine for G, a machine without cycles
and without initial predictive states (other than the starting state). Then it
is well known that the language M accepts is regular, that M has no more power
than a finite-state machine. Since M is cycle-free, M's stack can only grow
to a finite bounded depth, so the information that would be contained on M's
stack can be coded into the state structure of the equivalent finite-state
machine. Alternatively, there will be a right-linear grammar whose nonterminals
stand for the states of the FSM and whose derivations simulate the actions of
this machine and hence the successive conditions of M's stack. And this is
exactly what our grammar T_M(G) would be in this case. It would basically
have two kinds of rules, either of the form (S, φ) → a(S, φa) or
(S, αβ) → (S, αA), plus in addition the single rule (S, S) → ε. (This would be
all, since there are no predictions in M.) This grammar would indeed be
right-linear; and the second part of a nonterminal name could quite clearly be
seen as representing the progress of M's processing. The rule
(S, φ) → a(S, φa), when used in a derivation, means that if M had φ on the
stack with a as the next input symbol, then M would read a onto the stack, and
(X, αβ) → (X, αA) means that M, with αβ on the stack, would decide to reduce
β to A. In other words, the details and nature of the grammar G get somewhat
garbled in transforming to T_M(G); it is primarily the structure of the
machine M derived from G that influences the structure of T_M(G).

When we consider M as a general cycle-free MSP(k) machine for G, much of
this approach is still valid. It is no longer true that T_M(G) will be regular;
in general, there will be non-right-linear rules like (X, φ) → (Y, ε)(X, φY).
But we can think of T_M(G) as being "almost" regular, of being composed of a
number of smaller almost regular grammars, each with a different sentence
symbol (X, ε). All the nonterminals (X, φ) for a given X may be thought of as
keeping track of the status of one stack (level), of bounded depth, in much
the same way that (S, φ) kept track of the whole stack in the cycle-free LR(k)
case. It is only the predictive rules that disturb the regularity of this
picture; these are the rules that correspond to starting new stack levels in
the more complicated MSP(k) machine structure, and that serve to glue together
the various sub-grammars, each of which suffices for a single stack level. And
state-splitting was defined in such a way that the top-down parser would
always know when to switch to a new sub-grammar, because the machine would
deterministically know when to start a new level. The two conditions,
cycle-free and MSP(k) machine, ensure two different properties of the machine
that make the derived grammar desirable. The first assures us that a
finite-state grammar will suffice to describe the actions of any one stack
level, while the second ensures that the decision on which sub-grammar to use
is determinable in advance. The LL(k)-ness of T_M(G) derives from the fact
that T_M(G) is basically a number of finite-state grammars cleverly pasted
together, so that one grammar knows when to turn control over to another.

With all this in mind, we can begin to think about the effect of T_M on
trees. Since T_M(G) is LL(k) even if G is not, we can be sure that T_M will
change the orientation of the trees, replacing left recursion in G by right
recursion in T_M(G). But the effect is far more than that. For a given stack
level in M, i.e., an (X, ε) node in the T_M(G) tree, the tree's structure
mimics the activities of M. Thus the tree will look locally like:

[Diagram: local structure of the T_M(G) tree]


The bifurcating branches are for rules like (X, φ) → a(X, φa), while the single
branches are for (X, αβ) → (X, αA). This represents the activities of M:
read, read, until it's time for a reduction, then reduce, then read again.

The full tree is obtained by pasting a number of these local trees
together. That is, occasionally the branching is for a rule
(X, φ) → (Y, ε)(X, φY); in that case, the left node of the branching is just
the root node of another such local tree. It is in this way that the tree at
large is constructed from smaller sub-trees. The tree of Figure 4.12 is made
up of three subtrees, with roots (S, ε), (B, ε), and (B, ε), as shown in
Figure 4.13. If we label these trees 1, 2, and 3, the figure indicates how
they fit together, how the root of one is effectively a leaf of another.

4.7 Compiling Power of T_M(G)

While it is certainly gratifying to know that T_M(G) generates the same
language as G and is LL(k) to boot, this information by itself is insufficient
to prove the value and worth of the transformation. For we are not
interested only in the problem of pure parsing, but in the rather more complex
issue of compiling; and the fact that T_M(G) can be deterministically
parsed in a top-down fashion is only one aspect of this problem. We must
convince ourselves that T_M(G) is at least as useful as G in terms of
directing the compilation of sentences of L(G). That is, we wish to show that
all transformations on the input performed by a compiler which is driven by
an LR(k) parser for G can also be effected by some compiler driven by an
LL(k) parser for T_M(G). This issue is particularly sensitive since we have
just seen how T_M can distort the phrase structure of a sentence in the
language; we must be reassured that this distortion is not so violent as to
make the new grammar unusable in compilation. We are of course referring to a
single-pass compilation process here, where the parser makes occasional
calls on "semantic routines" which produce as output some representation of
the meaning of the program being parsed; since it is possible to reconstruct
the G-tree for a string from its T_M(G)-tree, any multipass compilation scheme
based on G can be simulated by a similar scheme based on T_M(G).

The most popular formal model for the compilation process is that of
syntax-directed translations, as developed by Lewis and Stearns [13]. We
briefly review their terminology.

Definition 4.14 A translation grammar based on the grammar G is a triple
(G, V_T', g), where V_T' is a set of output terminal symbols disjoint from
V_N, and g is a mapping which takes the right-hand side of the rule A → α of
G into a string g(A, α) over (V_N ∪ V_T')* such that the nonterminals of
g(A, α) are some permutation of the nonterminals of α. We refer to g(A, α)
as the translation element for the rule A → α.

We shall usually represent a translation grammar by writing the translation
element for a rule in braces next to the rule itself. For example,
A → aBC{xCyB} might be one rule of a translation grammar.

Definition 4.15 If for each rule A → α of G, the nonterminals of α appear
in the same order in g(A, α) as they do in α, then (G, V_T', g) is called a
simple translation grammar. If the nonterminals of g(A, α) appear to the
left of the terminals of g(A, α), then (G, V_T', g) is a simple Polish
translation grammar.

Thus A → aBC{xByCz} is not a simple Polish rule (though it is simple)
while A → aBC{BCz} is simple Polish.
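
The three kinds of translation element can be told apart mechanically. The
following sketch assumes, purely for illustration, that every symbol is one
character and that upper-case letters are the nonterminals; the function names
are hypothetical.

```python
import string


def nonterminals(s):
    """Nonterminals of s, in order of appearance."""
    return [c for c in s if c in string.ascii_uppercase]


def is_translation_element(rhs, element):
    """The nonterminals of the element are a permutation of those of rhs."""
    return sorted(nonterminals(rhs)) == sorted(nonterminals(element))


def is_simple(rhs, element):
    """The nonterminals appear in the same order in both strings."""
    return nonterminals(rhs) == nonterminals(element)


def is_simple_polish(rhs, element):
    """Simple, and every nonterminal precedes every output symbol."""
    tail = element.lstrip(string.ascii_uppercase)
    return is_simple(rhs, element) and not nonterminals(tail)
```

With the rules used in the text, is_simple("aBC", "xCyB") is false,
is_simple("aBC", "xByCz") is true, and only "BCz" passes is_simple_polish.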

The purpose of a translation grammar is to specify for each string
ω ∈ L(G), a string ω' over (V_T')*, which will be called the translation of ω;
this represents the alternative form of ω into which we wish to compile it.
The translation elements specify how ω' is defined in terms of ω, as follows.

Definition 4.16 If (G, V_T', g) is a translation grammar, then the associated
grammar G' has nonterminals V_N, terminals V_T', and its rules are given by:
if A → α is a rule of G, then A → g(A, α) is a rule of G'.

Definition 4.17 Every derivation in G has an associated derivation in G',
which is obtained by substituting associated rules for corresponding
nonterminals. If ω is in L(G), then the derivation in G' associated with the
derivation of ω in G generates a terminal string over (V_T')*, called
the translation of ω.

This last definition can be made more precise and formal (see [1], p.
219), but for our purposes this specification will suffice.
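
As an informal illustration of Definitions 4.16 and 4.17, the sketch below
computes the translation of a string from its derivation tree by replacing
each rule with its translation element. The rule A → aBC{xCyB} is the one used
earlier in the text; the rules B → b{p} and C → c{q}, the tree encoding, and
the function name are assumptions made only for this example.

```python
# Translation elements g(A, alpha), keyed by the rule (A, alpha).
# The rules for B and C are hypothetical.
G_ELEMENTS = {
    ("A", "aBC"): "xCyB",
    ("B", "b"): "p",
    ("C", "c"): "q",
}


def translate(tree, elements):
    """Translate a derivation tree (lhs, rhs, children).

    children holds one subtree per nonterminal of rhs, in rhs order.
    Upper-case characters are taken to be nonterminals.
    """
    lhs, rhs, children = tree
    # Queue the subtrees under the nonterminal they derive from.
    pending = {}
    for symbol, child in zip((c for c in rhs if c.isupper()), children):
        pending.setdefault(symbol, []).append(child)
    out = []
    for c in elements[(lhs, rhs)]:
        if c.isupper():
            # k-th occurrence in the element uses the k-th subtree in rhs.
            out.append(translate(pending[c].pop(0), elements))
        else:
            out.append(c)
    return "".join(out)


# Derivation tree of the input string "abc".
TREE = ("A", "aBC", [("B", "b", []), ("C", "c", [])])
```

Here translate(TREE, G_ELEMENTS) gives "xqyp": the output for C is placed
before the output for B, as the element xCyB demands.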

Definition 4.18 Let (G, V_T', g) be a translation grammar. Then {(x, y) |
x ∈ L(G) and y is the translation of x} is the translation defined by the
translation grammar.

We  note  that  we  will  be  restricting  our  attention  to  LR(k)  grammars, 
which  are  unambiguous;  hence  every  string  will  have  a  unique  translation. 

The translation of a string may be interpreted either as the representation
of the meaning of that string in terms of some intermediate language,
or as a sequence of calls to semantic "action routines" which will perform
the appropriate compilation activities for that string, or even as some
mixture of the two concepts.

This model has proven to be an understandable and useful model of compiling.
Writing a translation element for each rule can frequently be a
convenient and simple method of specifying in a local way the transformation
which is to be performed by a compiler for the language; for each rule,
the translation element describes what actions are to be taken during and
after the recognition of that rule. The question then naturally arises as to
whether and how the translation specified by a translation grammar can be
actually implemented by some formal automata-theoretic model. The kind of
model that has customarily been used for these purposes is the (deterministic)
pushdown transducer [1, 13], which is basically a pushdown machine with one
added feature: namely, when making any move the transducer can emit as output
some string of symbols from some new output vocabulary. This model is
more fully defined in [1, 2, 13]. We shall briefly restate some of the
principal results pertaining to this model, with reference to translations
defined on LL(k) and LR(k) grammars.

Definition 4.19 A pushdown transducer performs (or implements) a translation
grammar based on G if the machine accepts precisely L(G) and if in accepting
ω ∈ L(G) it produces as output the translation of ω.

Theorem 4.20 Any simple translation based on an LL(k) grammar G can be
performed by a deterministic pushdown transducer.


Sketch of proof: Since G is an LL(k) grammar, it can be recognized by some
deterministic pushdown acceptor. We modify this acceptor in the following
way. Consider any rule of G, A → a_1 B_1 a_2 B_2 ... a_n B_n a_{n+1}, where
each a_i is either ε or a string of terminal symbols; the general simple
translation element that can be associated with this rule is
{x_1 B_1 x_2 B_2 ... x_n B_n x_{n+1}}, where each x_i is some string of output
symbols. Now for some configurations of the machine, with A on top of the
stack and some lookahead, the machine will specify that A is to be replaced by
a_1 B_1 ... a_n B_n a_{n+1} on the stack. We change these entries in the
acceptor's state table so that A will be replaced on the stack in these cases
by a_1 x̄_1 B_1 a_2 x̄_2 B_2 ... a_n x̄_n B_n a_{n+1} x̄_{n+1}, where x̄_i is a
new symbol unique for x_i. We further modify the machine so that when x̄_i is
on the top of the stack, it is popped off, and x_i is emitted as output.
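
The construction in this proof sketch can be tried out on a small case. The
grammar below, S → aSb{xSy} and S → c{z}, is a hypothetical LL(1) example that
does not come from the text, and the Emit class plays the part of the new
stack symbols x̄_i. This is a minimal sketch of the idea, with single-character
symbols and upper-case nonterminals assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emit:
    """Stack marker standing for an output string x_i."""
    text: str


# LL(1) table: (nonterminal, lookahead) -> right-hand side with markers.
# Each marker is placed after the terminals a_i, as in the proof sketch.
TABLE = {
    ("S", "a"): ["a", Emit("x"), "S", "b", Emit("y")],
    ("S", "c"): ["c", Emit("z")],
}


def transduce(text, table=TABLE, start="S"):
    """Run the predictive pushdown transducer; return the emitted output."""
    stack = [start]          # top of stack is the end of the list
    pos = 0
    out = []
    while stack:
        top = stack.pop()
        if isinstance(top, Emit):
            out.append(top.text)             # pop the marker, emit x_i
        elif top.isupper():
            look = text[pos] if pos < len(text) else "$"
            rhs = table.get((top, look))
            if rhs is None:
                raise ValueError(f"no rule for {top} on {look!r}")
            stack.extend(reversed(rhs))      # replace A by its expansion
        else:
            if pos >= len(text) or text[pos] != top:
                raise ValueError(f"expected {top!r} at position {pos}")
            pos += 1                         # match an input terminal
    if pos != len(text):
        raise ValueError("input not fully consumed")
    return "".join(out)
```

For the input aacbb the transducer emits xxzyy, which is the translation given
by the two translation elements.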


A related statement pertains to the capability of translating LR(k)
grammars.

Theorem 4.21 Any simple Polish translation based on an LR(k) grammar G
can be performed by a deterministic pushdown transducer.

Sketch of proof: Since G is an LR(k) grammar, it can be recognized by
some deterministic pushdown acceptor. We shall modify this acceptor as
follows. If A → a_1 B_1 ... B_n a_{n+1} is a rule of G, the most general simple
Polish translation element for this rule is {B_1 ... B_n x}. We then alter
the acceptor so that when it reduces a_1 B_1 ... a_{n+1} to A, by popping the
former off the stack and replacing it by A, it also emits x as output.
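
The effect of this modification can be checked on the grammar of the earlier
example without building the LR(k) tables: the output of the modified acceptor
is simply the concatenation of the strings x in the order of the reductions.
In the sketch below the reduction sequence is the one found for aaacxxyx,
while the output strings "1" to "5" are hypothetical. The second function
recomputes the translation from the associated grammar, as a cross-check.

```python
# Bottom-up order of reductions for "aaacxxyx" in G (from the example).
REDUCTIONS = [("A", "ac"), ("B", "Ax"), ("A", "aB"),
              ("B", "Axy"), ("A", "aB"), ("S", "Ax")]

# Hypothetical output string x of each element {B_1 ... B_n x}.
OUTPUT = {("A", "ac"): "1", ("B", "Ax"): "2", ("A", "aB"): "3",
          ("B", "Axy"): "4", ("S", "Ax"): "5"}


def emit_on_reduce(reductions, output):
    """Output of the modified acceptor: emit x whenever a rule is reduced."""
    return "".join(output[rule] for rule in reductions)


def translate_by_definition(reductions, output, start="S"):
    """Translation via the associated grammar, rules A -> B_1 ... B_n x."""
    form = start
    for lhs, rhs in reversed(reductions):
        # Expand the rightmost nonterminal (upper-case character).
        i = max(j for j, ch in enumerate(form) if ch.isupper())
        if form[i] != lhs:
            raise ValueError("reductions are not a bottom-up parse")
        element = "".join(c for c in rhs if c.isupper()) + output[(lhs, rhs)]
        form = form[:i] + element + form[i + 1:]
    return form
```

Both functions return the same string for this example, which is what the
theorem leads one to expect for a simple Polish translation.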

There is a converse to this latter result, namely that any translation
performed by a deterministic pushdown transducer can be specified as a simple
Polish translation on some LR(1) grammar.

These results can be interpreted on an intuitive basis. If G is LL(k),
then it can be parsed deterministically top-down, and so the right-hand side
of a rule can be identified before it has been read; thus the identity of
the output symbols that have to be emitted for that rule can be determined in
advance, and some output symbols which are peculiar to that rule can be
emitted before the entire right-hand side has been read. For an LR(k) grammar,
however, a rule is only identified once its end has been reached;
while output symbols that call for emission at the end of the rule can be
safely determined at that time, it is too late by then to emit symbols
that should have been put out earlier.

These results can also be misinterpreted and be made to seem much more
significant than they actually are. It has sometimes been said that these
results imply that the most general kind of translation that can be performed
by a translator based on an LL(k) (LR(k)) parser for G are those that
can be described by simple (simple Polish) translation grammars based on G.
This is not the case, and only appears to be the case because of the paucity
of the translator model usually employed. Sometimes these results are used
as an argument for preferring LL(k) parsing of a particular grammar G to
LR(k) parsing for G, the argument being that more translations expressed
on G can be implemented by a translator based on an LL(k) parser than by one
based on an LR(k) parser for G. This statement is simply not true. Aho
and Ullman have shown in a recent paper [2] that any simple translation on
an LL(k) grammar G can be performed by a translator based on the LR(k) parser
for G. In other words, while any simple Polish translation on an LR(k)
grammar can be performed by a deterministic pushdown translator, it may
be the case that some non-simple Polish translations can also be performed
by a translator based on the LR(k) parser for G.

In order to focus clearly on the appropriate issues of translations, let
us reestablish the kind of context in which our transformation might be
applied. The scenario is that a language designer has specified an LR(k)
grammar for his language, which reflects his conception of the language's
primitives and features; and that he also has defined the semantics of the
language in compilation-oriented terms, that is, by means of a translation
grammar or its equivalent. Now he would like to obtain an equivalent grammar,
but one more suitable for use in a compiler; an LL(k) grammar, for which
a compiler can be cleanly and efficiently designed and implemented. So he
supplies his grammar to a transforming program which finds a cycle-free
MSP(k) machine for the grammar (as described in the next chapter), and which
then derives an LL(k) grammar from that machine. However, we want to be sure
that a compiler which uses this new LL(k) grammar can implement the kind of
compilation activities which the designer specified for his original grammar.
In particular, even if the original translation grammar were not simple
Polish, but could nevertheless have been performed by a compiler which used
an LR(k) parser for the original grammar, then it behooves us to demonstrate
that such a translation can also be effected by some compiler using the new
grammar. Therefore we want to look more closely at the kinds of translations
describable on LR(k) grammars that can be implemented by compilers using LR(k)
parsers; these are the kinds of translations we want to be able to implement
on a compiler based on an LL(k) parser for the transformed grammar. Thus while
it would be straightforward to show that every simple Polish translation of G
has an equivalent simple translation on T_M(G), this is only part of what we
want to establish.

In order to minimize potentially confusing terminology, from now on we
shall use the term "compiler" rather than "translator" to refer to a machine
model that parses and emits output. A compiler will be based on a particular
parser if it emits output in conjunction with the actions of that parser in
recognizing a string. We shall have occasion to refer to compilers based on
LR(k) parsers as well as to compilers based on MSP(k) parsers; rather than make
separate definitions for these two cases, we make just one, for the MSP(k)
case; and the LR(k) compiler is just a special case of that.

Definition 4.22 Let M be an MSP(k) machine for G. Then P, a compiler based
on M, is a triple (M, V_T', δ_P), where V_T' is a finite output vocabulary and
δ_P is the output function, which takes (Q_1 ∪ Q_2 ∪ Q_3) × (V_N ∪ V_T) × V_T^k
into (V_T')*. A configuration of such a translator is a quadruple
(q, α, ω, y), where the first three components are the configuration of M and
y ∈ (V_T')* is the output emitted so far. As for successor configurations, we
define (q, α, ω, y) ⊢_P (q', α', ω', y') as follows: q', α', and ω' are as
given by (q, α, ω) ⊢_M (q', α', ω'); if (q', α', ω') is a read-successor of
(q, α, ω), then q' = f_M(q, a, t) for some a and t, so we let
y' = y · δ_P(q, a, t); if (q', α', ω') is a reduce-successor of (q, α, ω),
then q' = f_M(q_i, A, t), for some q_i, A, and t, so we let
y' = y · δ_P(q_i, A, t); otherwise y' = y.

All this says is that we associate an output string with every non-predictive
transition between states of M; that output is emitted whenever
that transition is followed, whether that be after a read or a reduce.

Definition 4.23 If P is a compiler based on the MSP(k) machine M, then the
translation defined by P is {(x, y) such that
(q_0, S q_0, x ⊣^k, ε) ⊢*_P (POP, S q_0 S POP, ⊣^k, y)}. If (x, y) is in the
translation defined by P, then x ∈ L(M) and y is called the translation of x.

Definition 4.24 If (G, V_T', g) is a translation grammar, and P is a compiler
for G, then P implements (G, V_T', g) if for each ω ∈ L(G), the translations of
ω defined by P and by (G, V_T', g) are the same.

This general machine-oriented notion of a translation is only partly
helpful. We are not really interested in the full range of translations
that can be computed by a compiler based on an LR(k) parser, because people
do not, in general, conceive of the meaning of a program purely and
directly in terms of how a compiler should operate on it and produce output;
the translation grammar is a much more effective model for the description of
the semantics of a grammar. We want to characterize those translation grammars
that can be implemented by a compiler. We already know that simple
Polish translations on an LR(k) grammar can be implemented by a compiler
based on the LR(k) parser for the grammar. But that is not necessarily the
only kind of translation that can be so implemented. For example, suppose
that A → aBb was the only rule of G which contained the terminal symbol a,
and suppose its translation element is {xBy}. Then even though this is not
a simple Polish translation element, an LR(k)-based compiler could nonetheless
handle it correctly; namely, on making an a-transition from an LR(k) state
containing A → .aBb(τ), it would emit output x. The problems of course
arise if there are rules like A → aBb{xBy} and A → aBc{uBv}; then during a
bottom-up parse, it really is impossible to know whether to emit x or u
on the a until the end of the rule has been reached. And of course, by
then it is too late to make the decision and emit the appropriate output,
because by then the output for the B will already have been generated.

We  try  to  capture  the  idea  of  a  compiler  which  is  constructed  from 
the  specification  of  a  translation  grammar  in  the  following  definition. 

Definition 4.25 Let (G, V_T', g) be a translation grammar based on the LR(k)
grammar G. A compiler P based on the LR(k) parser for G is designed to
implement (G, V_T', g) if the following two conditions hold:

    i)  P implements (G, V_T', g)

    ii) let A → ασβ be any rule of G; then there is a string
        y ∈ (V_T')* such that if A → α.σβ(τ) is an item of state
        q and ω ∈ FIRST_k(βτ), then δ_P(q, σ, ω) = y.

The  meaning  of  this  definition  is  that  each  symbol  of  each  rule  is  to 
have  an  output  associated  with  it,  and  that  when  the  parser  locates  a  symbol 
of  a  rule,  the  compiler  emits  that  symbol's  associated  output. 

The motivation for this definition is that it is too easy for a randomly
designed compiler based on the LR(k) parser for G to "accidentally" implement
a translation grammar, without having been explicitly designed to
do so. And since such compilers may be capriciously constructed and may
often bear no visible relationship to the translation they implement, it is
very difficult to discuss, or make any general statements about, the full
class of compilers that implement a particular translation grammar.
Furthermore, we are really interested only in compilers that are designed by
specification to implement a given translation grammar, rather than those that
just turn out to do so.

For example, consider a rule of a translation grammar A → BaD{B x_1 x_2 D}.
It would be possible for a compiler that implemented this translation to have
items A → .BaD(τ_1) and A → .BaD(τ_2) in two different states, and yet have
different outputs associated with the B transitions out of these states.
This situation is illustrated in Figure 4.14; we have not written the
lookaheads on the transitions, but rather have indicated the outputs
on the transitions.
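
Condition (ii) of Definition 4.25 lends itself to a mechanical check. The
sketch below is a simplification made only for illustration: lookaheads are
ignored, an item is encoded as (left-hand side, right-hand side, dot position),
and the state names, the output tables, and the function name are all
hypothetical. The first table mimics the kind of situation described above for
the rule A → BaD{B x_1 x_2 D}; the second attaches the same output to each
symbol of the rule in every state.

```python
from collections import defaultdict


def condition_ii_violations(states, delta):
    """Return the rule positions whose output differs between states.

    states: {state name: set of items (lhs, rhs, dot)}
    delta:  {(state name, symbol): output string}; missing entries mean "".
    """
    outputs = defaultdict(set)
    for q, items in states.items():
        for lhs, rhs, dot in items:
            if dot < len(rhs):
                symbol = rhs[dot]
                outputs[(lhs, rhs, dot)].add(delta.get((q, symbol), ""))
    return sorted(pos for pos, outs in outputs.items() if len(outs) > 1)


# Two states hold A -> .BaD and two hold A -> B.aD (lookaheads omitted).
STATES = {
    "q1": {("A", "BaD", 0)}, "q2": {("A", "BaD", 0)},
    "q3": {("A", "BaD", 1)}, "q4": {("A", "BaD", 1)},
}

# Both paths emit x1x2 in total, but split it differently between B and a.
CAPRICIOUS = {("q1", "B"): "x1", ("q3", "a"): "x2",
              ("q2", "B"): "", ("q4", "a"): "x1x2"}

# The same output is attached to each symbol of the rule in every state.
DESIGNED = {("q1", "B"): "x1", ("q2", "B"): "x1",
            ("q3", "a"): "x2", ("q4", "a"): "x2"}
```

The first table is reported as violating the condition at both positions of
the rule, while the second produces no violations.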

Figure 4.14

It is situations like this that we wish to exclude for the reasons
described above. Note that we have not restricted the compiler model so that
if A → BaD{B x_1 x_2 D} is a rule of the translation grammar being
implemented, then the output associated with the B-transition out of any state
with item A → .BaD(τ) in it must be x_1 x_2. Such a restriction would
effectively say that outputs in the compiler are only associated with
nonterminals, which is not necessarily a good model of how translations are
designed. In particular, no such LR(1) compiler could implement a translation
grammar with rules A → BabD{B x_1 D} and A → BacD{B x_2 D}, since items
A → .BabD(τ) and A → .BacD(τ) would be in the same state; and then what should
the output be on a B/a transition from that state: x_1 or x_2?

We feel that our restricted class of compilers is large enough to effectively
capture the utility of the compiler model of implementing translations,
yet small enough to be manageable. Furthermore, a little reflection
confirms one's intuition that if some compiler based on the LR(k) parser for
G implements (G, V_T', g), then there is some compiler designed to implement
the translation, at the expense of a possible increase in the value of k used
by the designed compiler to determine what outputs to emit. Since this result
is not critical to our work, we shall not pursue the laborious constructions
needed to verify it; we only mention it in passing, as another justification
for the reasonableness of our model.

Thus the class of translation grammars that are of effective use in
building compilers are those that have compilers designed to implement them.
We now wish to show that our transformation preserves this class and so
maintains the compilation power of the underlying language, even using a
different grammar.

Theorem 4.26 Let G be an LR(k) grammar, (G, V_T', g) a translation grammar
based on G, and let M be any MSP(k) machine for G. Then if there is some
compiler P_0, based on the LR(k) parser for G, that is designed to implement
(G, V_T', g), then there also is a compiler P based on M designed to implement
(G, V_T', g).


Proof  We  oust  define  the  output  function  6^.  This  is  done  as  follows.  Let 

q  be  any  state  of  H;  then  by  Corollary  3,49,  q  is  a  suLstate  of  q*,  3ome 

state  ci  Mg,  the  LR(k)  machine  for  G.  ’’’Sea  if  A  a.  oP(t)  is  an  item  of  q, 

and  w  £  FIRSTLY),  we  define  6p  (q,  a,  u)  as  equal  to  6p  (q*,  a,  <*>). 

The first thing we must show is that $\delta_P$ is well-defined. First of all,
if q is a substate of $q'$, and if $A \to \alpha.\sigma\beta(\tau)$ is an item
of q, then it is an item of $q'$ as well; and so $\delta_{P_0}(q', \sigma, \omega)$
is defined. Next, suppose q is a substate of both $q'$ and $q''$. Then
$A \to \alpha.\sigma\beta(\tau)$ will be an item of both $q'$ and $q''$, and
since the compiler based on $M_0$ is designed to implement $(G, V_T', g)$, we
have $\delta_{P_0}(q', \sigma, \omega) = \delta_{P_0}(q'', \sigma, \omega)$ if
$\omega \in \mathrm{FIRST}_k(\beta\tau)$. Thus $\delta_P(q, \sigma, \omega)$ is
defined and single-valued.

It is clear that this new compiler satisfies the second condition for being
designed to implement $(G, V_T', g)$. Suppose $A \to \alpha.\sigma\beta(\tau_1)$
is an item of $q_1$ in M and $A \to \alpha.\sigma\beta(\tau_2)$ is an item of
$q_2$ in M, where $\omega \in \mathrm{FIRST}_k(\beta\tau_1) \cap \mathrm{FIRST}_k(\beta\tau_2)$.
Then we must show that $\delta_P(q_1, \sigma, \omega) = \delta_P(q_2, \sigma, \omega)$.
But $q_1 \subseteq q_1'$ and $q_2 \subseteq q_2'$, which are states of $M_0$. Then

$\delta_P(q_1, \sigma, \omega) = \delta_{P_0}(q_1', \sigma, \omega) = \delta_{P_0}(q_2', \sigma, \omega) = \delta_P(q_2, \sigma, \omega)$,

where the middle equality holds because the compiler $P_0$ is by hypothesis
designed to implement $(G, V_T', g)$.

Finally, we must show that the M-based compiler implements $(G, V_T', g)$.
We do this by showing that it is equivalent to the $M_0$-based compiler. We
know that $L(M) = L(M_0)$, so we need only demonstrate that if $x \in L(M)$,
then the M-based compiler produces the same output for x as does the
$M_0$-based one. We know by 3.66 and 3.69 that M and $M_0$ process any string
in essentially the same way; that is, every time M makes a $\sigma/\omega$
transition from a state q, $M_0$ also makes a $\sigma/\omega$ transition from a
state $q'$ that contains q. (A crucial point to be noted here is that if M
makes an $A/\omega$ transition to the POP state, it will make another
$A/\omega$ transition after disposing of the topmost stack level; however,
$M_0$ makes only one corresponding $A/\omega$ transition, so it might be feared
that M emits an extra output. But this is not the case, for if
$f_M(q, A, \omega) = \mathrm{POP}$, then there is no item
$B \to \alpha.A\beta(\tau)$ with $\omega \in \mathrm{FIRST}_k(\beta\tau)$ in q,
so $\delta_P(q, A, \omega)$ will not be defined.) Thus M emits the same outputs
as $M_0$ in processing x, and in the same order; furthermore, M does not emit
any additional output, since outputs are not emitted upon making a prediction
or transferring to POP. Thus for any x, M and $M_0$ produce the same outputs,
and so P is indeed a compiler designed to implement $(G, V_T', g)$. Q.E.D.


Now that we have shown that no real translating power is lost by parsing
with M instead of with $M_0$, we will show that nothing is lost by parsing with
$T_M(G)$ rather than by using M for G.


Theorem 4.27  Let M be a cycle-free MSP(k) machine for G, and P a compiler
based on M. Then there is a compiler based on the LL(k) parser for $T_M(G)$,
which is equivalent to P.


Proof  The new compiler is specified in the following way. Suppose q is some
non-final state of M named $(X, \varphi)$, and suppose $f_M(q, \sigma, \tau)$ is
defined, with $\delta_P(q, \sigma, \tau) = y$. Then if $(X, \varphi\sigma)$ is
the top symbol of the stack during an LL(k) parse using $T_M(G)$, the new
compiler is to emit y as output before replacing $(X, \varphi\sigma)$ by the
appropriate right-hand side.

Since there is only one non-final state in M named $(X, \varphi)$, it is
immediate that this compiler is well-defined. Now we know that
$L(T_M(G)) = L(M)$. Furthermore, we know by Lemma 4.11 that the LL(k) parse of
a string based on $T_M(G)$ effectively simulates the processing of that string
by M. In particular, if a $\sigma/\tau$-transition is ever followed out of q to
a state named $(X, \varphi\sigma)$, then at a corresponding time in the LL(k)
parse, $(X, \varphi\sigma)$ will be on top of the stack with $\tau$ as the
lookahead. The new compiler is designed to emit y at that time, so it has the
same effect as the original one.

This  sketch  can  readily  be  expanded  into  a  full  proof.  Q.E.D. 

We make a note about the timing of these two translations. In P, the output
y is emitted as the transition is followed on $\sigma$ from q; however, in the
compiler using $T_M(G)$, the output is not emitted upon recognition of the rule
$(X, \varphi) \to \sigma(X, \varphi\sigma)$ or $(X, \varphi') \to (X, \varphi\sigma)$,
whichever one puts $(X, \varphi\sigma)$ onto the stack, but rather a little
later, when $(X, \varphi\sigma)$ already resides on top of the stack, and prior
to its replacement by some right-hand side. In a sense, it is like P emitting
the output upon leaving state $(X, \varphi\sigma)$, rather than on entering it.
The reason for this is that there is a difference in the way MSP(k) compilers
and LL(k) compilers can use k symbols of lookahead to determine what output to
emit. An MSP(k) compiler can inspect k symbols after $\sigma$, while the LL(k)
compiler can just look an absolute k symbols ahead into the remaining input. In
the case where $\sigma$ is a terminal symbol, this is a real difference, and so
the $T_M(G)$ compiler has to wait until after the $\sigma$ has been read in
order to see as much as the MSP(k) compiler could when it was reading $\sigma$,
in order to decide what output to produce.

We  can  summarize  all  of  the  foregoing  as  follows. 

Theorem 4.28  Let $(G, V_T', g)$ be a translation grammar such that there is
some compiler based on the LR(k) parser for G which is designed to implement
$(G, V_T', g)$. Then if M is a cycle-free MSP(k) machine for G, there is a
compiler based on the LL(k) parser for $T_M(G)$ which implements $(G, V_T', g)$.

This theorem follows immediately from the preceding results. It says
that any simple translation of G that can be implemented by a naturally
defined LR(k)-based compiler can also be implemented by a compiler based on
the LL(k) parser for $T_M(G)$. Or in other words, any useful translation of G
can be done by some $T_M(G)$ compiler.

This is the most useful form in which to state this result. It would
profit us little to find some other formal characterization of this set of
translations on G, for our goal is not to study in the abstract the kind of
G-based translations that can be implemented on a $T_M(G)$ compiler, but just
to demonstrate that the $T_M(G)$ compilers can handle anything that a
reasonably constructed compiler for G could.

However, if another characterization is desired, the rudiments of one
already exist. In a difficult section of [13], Lewis and Stearns introduce
the notion of a "distinction index." Briefly, the distinction index of two
instances of the same symbol in right-hand sides of two rules of a grammar
is the amount of lookahead necessary to distinguish between occurrences of
these instances in a sentential form. In a roundabout way, Lewis and Stearns
discuss a general class of simple translations on an LR(k) grammar which can
be implemented by a pushdown translator using the LR(k) grammar. The class
consists of those translation grammars that satisfy the following property:
two instances of a symbol on right-hand sides of rules may have different
outputs "associated" with them in the rules' respective translation elements
only if the distinction index of the two instances is less than or equal to k.
Of course we have paraphrased their results in an informal way, but an
inspection of [13] will verify that our class of transformable translations
is a generalization of their class of implementable translations.

We note that the translation performed by the compiler based on the
$T_M(G)$ parser of Theorem 4.27 may not be expressible as a translation grammar
on $T_M(G)$. But that is of little real significance, for we are really only
interested in the compiling power of $T_M(G)$ parsers. The LL(k) grammar
$T_M(G)$ and its associated parser and compiler need never see the light of
day in the kind of automatic compiler writing system that this work might be
applied to. To use such a system, as we have envisioned it, a programming
language designer would construct a grammar for his language and provide it,
together with translation elements specifying the semantics of each rule, to
a grammatical processor. This processor would convert the designer's grammar
into an equivalent LL(k) grammar by applying our transformation, and it
would also design a compiler, based on the LL(k) parser for the transformed
grammar, to implement the originally specified translation. This compiler
would be used to actually compile programs in the language; its effective
activities would be described by the original translation grammar; and the
fact that there is no compact way of describing this compiler's precise
actions by means of a translation grammar on the transformed grammar would not
be so important.

However, for the sake of completeness, we will characterize a class of
translations on G that can be described by simple translations on $T_M(G)$.


Theorem 4.29  Let $P_0$ be a compiler based on the LR(k) parser for G, designed
to implement $(G, V_T', g)$, and also satisfying the property: if
$f_{M_0}(q, \sigma, \tau_1) = f_{M_0}(q, \sigma, \tau_2)$, then
$\delta_{P_0}(q, \sigma, \tau_1) = \delta_{P_0}(q, \sigma, \tau_2)$. Then if M is
a cycle-free MSP(k) machine for G, there is a simple translation grammar based
on $T_M(G)$ which is equivalent to $(G, V_T', g)$.


Proof  First we define the compiler P based on M that is equivalent to $P_0$,
the one based on $M_0$; this can be done by Theorem 4.26. Then we define the
translation elements for $T_M(G)$ as follows. For the rule
$(X, \varphi\sigma) \to a(X, \varphi\sigma a)$ or
$(X, \varphi\sigma) \to (Y, \varepsilon)(X, \varphi\sigma Y)$, consider the
unique non-final state named $(X, \varphi\sigma)$; by hypothesis, all entries
to that state from q, the non-final $(X, \varphi)$ state, have an identical
output y associated with each one. Then the translation elements for these
rules are to be
$(X, \varphi\sigma) \to a(X, \varphi\sigma a)\ \{y\,(X, \varphi\sigma a)\}$ and
$(X, \varphi\sigma) \to (Y, \varepsilon)(X, \varphi\sigma Y)\ \{y\,(Y, \varepsilon)(X, \varphi\sigma Y)\}$.
For the rule $(X, \varphi\sigma) \to (X, \varphi')$, there is exactly one final
state whose rule expresses the relationship between $\varphi\sigma$ and
$\varphi'$; let z be the output associated with entry to that state from q.
Then we have $(X, \varphi\sigma) \to (X, \varphi')\ \{z\,(X, \varphi')\}$. If a
rule is not assigned a translation element by these stipulations, its
translation element is to be just the nonterminals of its right-hand side. It
is immediate that this translation grammar is well-defined.

Let us design a $T_M(G)$-based compiler $P'$ to implement this translation
grammar as follows. If $(X, \varphi) \to \gamma\ \{y\,\gamma'\}$ is a rule of
the translation grammar, then whenever $(X, \varphi)$ is on top of the stack,
with $\tau$ as the lookahead, where $\tau \in \mathrm{FIRST}_k(\gamma)$, the
compiler is to emit y as it replaces $(X, \varphi)$ by $\gamma$ on the stack.
This compiler is well-defined: since $T_M(G)$ is strong LL(k), for a given
$(X, \varphi)$, any $\tau$ will be in $\mathrm{FIRST}_k$ of at most one
$(X, \varphi)$-rule. And it is clear that it does implement this translation
grammar.

But as we have seen in Lemma 4.10, $\tau$ is in $\mathrm{FIRST}_k$ of a rule
$(X, \varphi\sigma) \to \gamma$ only if $\tau$ is the lookahead on some
transition into an $(X, \varphi\sigma)$ state, which is non-final if $\gamma$
is $a(X, \varphi\sigma a)$ or $(Y, \varepsilon)(X, \varphi\sigma Y)$, and final
otherwise. By the way the translation grammar has been defined, the y output
associated with all $\sigma/\tau$ entries into the same state is the output
specified for the appropriate $(X, \varphi\sigma) \to \gamma$ rule(s). Thus
this compiler $P'$ not only implements the translation grammar, but is also
precisely the compiler described in the proof of Theorem 4.27, which is
equivalent to the M-based compiler P. Thus by a chain of equalities, the
$T_M(G)$-based translation grammar is equivalent to $(G, V_T', g)$. Q.E.D.


Corollary 4.30  If M is a cycle-free MSP(k) machine for G, then every simple
Polish translation on G can be expressed as a simple translation on $T_M(G)$.


Proof  We define an LR(k)-based compiler for the simple Polish translation
as follows. If $A \to \gamma\ \{\gamma' y\}$ is a rule of the translation
grammar, then the compiler is to emit y as output upon every entry to the
final state for $A \to \gamma$; no other outputs are to be emitted.

This compiler satisfies the hypotheses of the previous theorem, so there
is a simple translation on $T_M(G)$ which is equivalent to the original simple
Polish translation. The proof of the preceding result tells us that this
translation will be the following:

$(X, \varphi) \to a(X, \varphi a)\ \{(X, \varphi a)\}$

$(X, \varphi) \to (Y, \varepsilon)(X, \varphi Y)\ \{(Y, \varepsilon)(X, \varphi Y)\}$

$(X, \varphi\gamma) \to (X, \varphi A)\ \{y\,(X, \varphi A)\}$

$(X, X) \to \varepsilon$

where y is the output for $A \to \gamma$ in G. Q.E.D.

We do not wish too much significance to be assigned to Theorem 4.29. It
is an interesting result, but not terribly important. First of all, there
are larger classes of simple translations on G which are expressible as simple
translations on $T_M(G)$, but they are not so easy or natural to describe.
Furthermore, the proof of Theorem 4.29 gives but one way for these translations
on G to be expressed on $T_M(G)$; there are other ways in which some of them
could be expressed. In particular, some simple translations on G can be
expressed as simple Polish translations on $T_M(G)$, if different techniques
are used for constructing the compilers P and $P'$ of the proof of the theorem;
and some simple translations on G cannot be expressed as translations on
$T_M(G)$. But we shall not dwell on this topic. It is tempting to get involved
in determining the precise relationship between translations expressible on G
and those expressible on $T_M(G)$. But such investigations would not be germane
to the course of our development. We have already shown the major result of
practical significance, namely that any useful translation expressible on G
can be implemented by a compiler using an LL parser for $T_M(G)$.
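
To make the shape of the translation in Corollary 4.30 concrete, the following Python sketch runs a table-driven LL(1) parse over a small hypothetical grammar of the same form as $T_M(G)$ and emits an output only when a rule of the form $(X, \varphi\gamma) \to (X, \varphi A)$ is applied. Everything specific here is an assumption made for illustration: the string names such as `"S:ad"` standing for $(S, ad)$, the hand-written table, and the underlying rules S -> A x, A -> a d, A -> a c whose names serve as outputs.

```python
END = "$"

# Hypothetical LL(1) table for a small grammar shaped like T_M(G).
# Key: (nonterminal on top of the stack, lookahead symbol).
# Value: (output emitted when the rule is applied, right-hand side).
# Only rules of the "reduction" form (X, phi gamma) -> (X, phi A) carry an output.
TABLE = {
    ("S:", "a"): (None, ["a", "S:a"]),
    ("S:a", "d"): (None, ["d", "S:ad"]),
    ("S:a", "c"): (None, ["c", "S:ac"]),
    ("S:ad", "x"): ("A -> a d", ["S:A"]),
    ("S:ac", "x"): ("A -> a c", ["S:A"]),
    ("S:A", "x"): (None, ["x", "S:Ax"]),
    ("S:Ax", END): ("S -> A x", ["S:S"]),
    ("S:S", END): (None, []),
}
NONTERMINALS = {nonterminal for nonterminal, _ in TABLE}


def translate(text, start="S:"):
    """Parse `text` top-down and return the list of emitted outputs."""
    tokens = list(text) + [END]
    stack = [start]
    position = 0
    outputs = []
    while stack:
        top = stack.pop()
        lookahead = tokens[position]
        if top in NONTERMINALS:
            entry = TABLE.get((top, lookahead))
            if entry is None:
                raise SyntaxError(f"unexpected {lookahead!r} with {top!r} on the stack")
            output, rhs = entry
            if output is not None:
                outputs.append(output)  # emit as the rule is applied
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == lookahead:
            position += 1  # match a terminal against the input
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    if tokens[position] != END:
        raise SyntaxError("input continues after the parse finished")
    return outputs


if __name__ == "__main__":
    print(translate("adx"))
```

In this small example the outputs come out in the order in which the assumed rules A -> a d and S -> A x would be reduced bottom-up, although the parse itself proceeds top-down.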

There is one further point that is both interesting and important: that
there are some simple translation grammars based on $T_M(G)$ that cannot be
expressed by any simple translation on G, nor even implemented by any compiler
based on the LR(k) parser for G. The intuitive reason for this is that when
$T_M(G)$ uses the rule $(X, \varphi\gamma) \to (X, \varphi A)$, which
corresponds to G using $A \to \gamma$, $T_M(G)$ has some extra information that
G does not: namely, that this derivation started with an X. Two similar rules,
say $(X, \varphi\gamma) \to (X, \varphi A)$ and
$(Y, \varphi'\gamma) \to (Y, \varphi' A)$, might have different, unrelated
translation elements in $T_M(G)$; yet both rules correspond to the LR(k)
machine reducing $\gamma$ to A. This fact is not really as exciting as it first
appears. The catch is, of course, that to specify such a translation, the
language designer would have to get his hands on either the cycle-free machine
M from which $T_M(G)$ is derived, or on $T_M(G)$ itself, and neither of these
occurrences is especially likely or desirable. Even further, the language
designer would then have to express his "context-dependent" outputs (for
example, that a conditional is to be translated one way if it is part of an
expression, another if it is a statement, which is just the kind of flexibility
$T_M(G)$ might allow) in terms of the grammar $T_M(G)$, which may be
unrecognizable to him, and which may bear little visible relationship to his
original grammar G; and so he might find it difficult to use this new grammar
to express his notions of the semantics of the language.

This concludes our discussion of the complex and somewhat murky area of
translations. However, we have confirmed the utility of our transformation by
showing that $T_M(G)$ is at least as useful as G is, in directing the
compilation of programs in L(G).

4.R  Improving $T_M(G)$

We have seen how to derive, given a cycle-free MSP(k) machine M for G,
an LL(k) grammar $T_M(G)$ which generates L(G). We have seen that $T_M(G)$
compares favorably with G with respect to compiling ability, and that the
number of steps in an LL(k) parse using $T_M(G)$ is not, in general, much
greater than the number of steps in an LR(k) parse using G. But we would like
to know if there is any way to improve $T_M(G)$ or enhance its parser that will
increase the parsing speed or have other salutary effects, without destroying
the translating power or the LL-ness of $T_M(G)$.

There is one very elementary enhancement that can be made to the LL(k)
parser for $T_M(G)$ that will significantly increase its parsing speed. This
is a well-known trick, described in many places, including [1], page 662.

In the basic LL(k) parser for $T_M(G)$, application of the rule
$(X, \varphi) \to a(X, \varphi a)$ takes place as follows: when $(X, \varphi)$
is the top stack symbol, and the lookahead specifies that this rule is to be
applied, $(X, \varphi)$ is removed from the stack and is replaced by
$a(X, \varphi a)$; in the next step, the symbol a will be on top of the stack,
and perforce a will also be the first lookahead symbol; so a is removed from
the input stream and also popped off the stack, exposing $(X, \varphi a)$.
There is no reason why this cannot be effected in one step: with
$(X, \varphi)$ at the top of the stack, and a as the first symbol of the
lookahead which is dictating application of $(X, \varphi) \to a(X, \varphi a)$,
just replace $(X, \varphi)$ by $(X, \varphi a)$ on the stack, and remove a from
the input stream.
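
The effect of this one-step shortcut can be seen in a minimal Python sketch of a table-driven LL(1) parser that counts its own steps, with and without fusing "expand" and "match". The table, the string names such as `"S:a"` for $(S, a)$, and the `fused` flag are hypothetical choices for this illustration only; they are not taken from [1] or from the construction of $T_M(G)$.

```python
END = "$"

# Hypothetical LL(1) table for a small grammar shaped like T_M(G):
# (nonterminal on top of the stack, lookahead) -> right-hand side.
TABLE = {
    ("S:", "a"): ["a", "S:a"],
    ("S:a", "d"): ["d", "S:ad"],
    ("S:a", "c"): ["c", "S:ac"],
    ("S:ad", "x"): ["S:A"],
    ("S:ac", "x"): ["S:A"],
    ("S:A", "x"): ["x", "S:Ax"],
    ("S:Ax", END): ["S:S"],
    ("S:S", END): [],
}
NONTERMINALS = {nonterminal for nonterminal, _ in TABLE}


def parse(text, start="S:", fused=False):
    """Return the number of parser steps used to accept `text`."""
    tokens = list(text) + [END]
    stack = [start]
    position = 0
    steps = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[position]
        steps += 1
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"unexpected {lookahead!r} with {top!r} on the stack")
            if fused and rhs and rhs[0] == lookahead:
                # Shortcut: consume the leading terminal now instead of
                # pushing it and matching it in a separate step.
                position += 1
                rhs = rhs[1:]
            stack.extend(reversed(rhs))
        elif top == lookahead:
            position += 1  # ordinary match step of the basic parser
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    if tokens[position] != END:
        raise SyntaxError("input continues after the parse finished")
    return steps


if __name__ == "__main__":
    print(parse("adx"), parse("adx", fused=True))
```

For the input `adx` the basic parser of this sketch takes 9 steps and the fused one 6, a saving of one step per input symbol.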

This speeded-up LL(k) parser can support any compilation activities that
the original parser could, but it is somewhat faster; in particular, there
will be one less step in the fast LL(k) parse for each symbol in the input
string being parsed. So now the fast LL(k) parse for $T_M(G)$ begins to compare
favorably with the LR(k) parse for G; the number of steps in the former will
be equal to the number of steps in the latter, plus twice the number of
predictions made, minus the length of the input string. In most normal
circumstances the last term will be greater than the second (since predictions
will rarely be made just to read two input symbols), so the fast LL(k) parse
will in practice usually be faster than the LR(k) parse.

With regard to the issue of the parser for $T_M(G)$, we would also like to
know if it really needs to inspect k symbols of lookahead. That is, we are
assured that $T_M(G)$ is LL(k); but is there a smaller k' such that $T_M(G)$ is
LL(k')? This question is not only of theoretical significance, but also
impacts the size and speed of the parser needed to analyze strings generated
by $T_M(G)$. In general, however, this question has a negative answer. We have
seen from our earlier study that $T_M(G)$ uses the lookahead to choose among
various rules precisely in the way that M uses the lookahead to choose among
various transitions. So if M at any point really needs to know all k symbols
of the lookahead, and cannot always get by with effectively looking at some
k'-length prefix of this lookahead, then $T_M(G)$ will also need to inspect the
full k symbols of lookahead at some point, and so will not be LL(k') for any
k' < k. These terms can be made more formal and easily lead to this result.

Theorem 4.31  If M is a cycle-free MSP(k) machine that uses k symbols of
input, then $T_M(G)$ is not LL(k'), for any k' < k.

This is not to say that a parser for $T_M(G)$ will always need to inspect
all k symbols of lookahead in order to determine its action, only that in some
cases it will need to. In many cases, the parser will no doubt be able to
get by with less, and a cleverly designed parser will take advantage of these
special cases, both to save parsing time and to reduce the parser table size.
But $T_M(G)$ will not really be LL(k').

There is one area in which $T_M(G)$ seems to be grossly deficient, and which
adversely affects both the size and speed of the LL(k) parser for $T_M(G)$; and
that is in the size of $T_M(G)$. The concept of the size of a grammar is not
precisely defined, but it is clearly related to the number of nonterminals and
the number of rules; and in both of these areas, $T_M(G)$ is greatly inflated
over G. For example, the grammar $T_M(G)$ of Figure 4.5, which is repeated in
Figure 4.15, has 24 rules and 18 nonterminals; the grammar G from which it was
derived had 7 rules and 3 nonterminals. This is not just a problem of
aesthetics. First of all, the size of a grammar directly affects the size of
the parsing table needed by the LL(k) parser for the grammar; more about this
soon. And furthermore, all these extra nonterminals mean extra reductions to
be made and more steps in the parse.


$(S, \varepsilon) \to a(S, a)$
$(S, a) \to (B, \varepsilon)(S, aB)$
$(S, a) \to d(S, ad)$
$(S, a) \to c(S, ac)$
$(S, aB) \to (S, A)$
$(S, ad) \to (S, A)$
$(S, ac) \to (S, A)$
$(S, A) \to x(S, Ax)$
$(S, Ax) \to (S, S)$
$(S, S) \to \varepsilon$

$(B, \varepsilon) \to a(B, a)$
$(B, \varepsilon) \to b(B, b)$
$(B, a) \to c(B, ac)$
$(B, a) \to d(B, ad)$
$(B, a) \to (B, \varepsilon)(B, aB)$
$(B, b) \to (B, B)$
$(B, ac) \to (B, A)$
$(B, ad) \to (B, A)$
$(B, aB) \to (B, A)$
$(B, A) \to x(B, Ax)$
$(B, Ax) \to (B, B)$
$(B, Ax) \to y(B, Axy)$
$(B, Axy) \to (B, B)$
$(B, B) \to \varepsilon$

Figure 4.15
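
The counts quoted for the grammar of Figure 4.15 can be checked mechanically. The Python sketch below is only a bookkeeping aid: it transcribes the figure into a compact text form and counts rules, nonterminals and terminals. The encoding is an assumption of this example (each nonterminal $(X, \varphi)$ is written `X:phi`, with nothing after the colon when $\varphi$ is empty, and an empty right-hand side stands for the empty string).

```python
# Transcription of the rules of Figure 4.15; "X:phi" stands for (X, phi).
RULES_TEXT = """
S: -> a S:a
S:a -> B: S:aB
S:a -> d S:ad
S:a -> c S:ac
S:aB -> S:A
S:ad -> S:A
S:ac -> S:A
S:A -> x S:Ax
S:Ax -> S:S
S:S ->
B: -> a B:a
B: -> b B:b
B:a -> c B:ac
B:a -> d B:ad
B:a -> B: B:aB
B:b -> B:B
B:ac -> B:A
B:ad -> B:A
B:aB -> B:A
B:A -> x B:Ax
B:Ax -> B:B
B:Ax -> y B:Axy
B:Axy -> B:B
B:B ->
"""


def read_rules(text):
    """Turn the text above into a list of (left side, right-side symbols)."""
    rules = []
    for line in text.strip().splitlines():
        left, _, right = line.partition("->")
        rules.append((left.strip(), right.split()))
    return rules


RULES = read_rules(RULES_TEXT)
NONTERMINALS = {left for left, _ in RULES}
TERMINALS = {symbol for _, right in RULES for symbol in right} - NONTERMINALS
TOTAL_RHS_LENGTH = sum(len(right) for _, right in RULES)

if __name__ == "__main__":
    print(len(RULES), "rules,", len(NONTERMINALS), "nonterminals,",
          len(TERMINALS), "terminals, total right-hand-side length",
          TOTAL_RHS_LENGTH)
```

Run on this transcription, the sketch reports 24 rules and 18 nonterminals, in agreement with the figures given in the text.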


It is impossible to precisely characterize the size of $T_M(G)$ (whether in
terms of rules or nonterminals) purely in terms of the size of G, since the
nature of $T_M(G)$ depends mainly on the structure of the cycle-free MSP(k)
machine M from which it is derived. It is possible to contrive some very
gross upper and lower bounds for the size of $T_M(G)$, but they are practically
useless. The problem is that while any useful cycle-free machine will have
a reasonable, describable structure, some cycle-free machines will not, and
they are the ones that determine the bounds. We can get some crude feeling
for the size of $T_M(G)$ by inspecting the machine M, without computing the
whole grammar. For example, we know that there will be at least one
nonterminal for each state of M. Also, for each predictive state in M, some
number of states will receive an additional name. At the very worst, the
number of nonterminals in $T_M(G)$ will be proportional to the square of the
number of states of M (and even this will have a constant factor less than
one). The number of rules can be similarly grossly bounded by the number
of nonterminals in $T_M(G)$ times the size of the vocabulary of G (terminal and
nonterminal). However, these bounds are not attainable, and are not even
approached by any reasonable grammar and machine. But in terms of simple
structural properties of M, it is difficult to obtain precise closed
expressions for the size of $T_M(G)$, for it is possible for M to have a very
baroque structure, causing a peculiarly shaped and sized derived grammar. We
shall not try to characterize the precise degree of $T_M(G)$'s deficiency; we
shall presume that in general it will be unsatisfactorily large, and shall
concentrate on the problem of making it smaller.

It is not immediately obvious what the appropriate parameters are, and
what their respective weights, when it comes to reducing the size of a
grammar. It is possible to make arguments for each of the following
characteristics of a grammar as being of great significance in measuring its
size: the number of rules; the number of nonterminals; the length of the
longest rule; the sum of the lengths of the rules; the value of the lookahead
which is needed to parse the grammar. Which of these should really be
considered foremost depends in great measure on the reason for trying to reduce
the grammar's size. In our case, we shall more or less arbitrarily select one
particular goal for reducing $T_M(G)$. We are not going to try to satisfy
our aesthetic sensibilities, or minimize some complexity-oriented measure
of a grammar; nor even attempt to make the grammar more "manageable" in some
vague way. We are interested in reducing the size of the parsing table that
will be constructed from the grammar and that will be used in a compiler for
the language. This reduction may come at the expense of increasing the
maximum size of the stack that can occur during a parse; this latter figure
is related to the lengths of the rules.

Thus we must consider what kind of parser we will be constructing, and
what factors influence its size. The whole thrust of our work has been to
create LL grammars, so we shall consider the kind of parsing table for LL
grammars described by several authors [1, 13, 20]. Basically, for a given
strong LL(k) grammar G, the table looks as follows: the rows are labelled
with the nonterminals of G, the columns with all strings in $V_T^k$. This table
will be used by the parser in the expected way: if the topmost element of
the stack is A, and the lookahead is $\omega$, then the $(A, \omega)$-entry of
the table is looked up, and replaces A on the stack. (The other function of the
parser is: if the terminal symbol a is on top of the stack and also the first
symbol of the input, then it is removed from both places.) What is the size of
such a table? The number of entries is $|V_N| \cdot |V_T|^k$; an $(A, \omega)$
entry itself will be the right-hand side of some A-rule. We can keep these
right-hand sides in some list and make the $(A, \omega)$ entry of the table
point to the appropriate list entry. Thus the amount of storage needed is
something like $|V_N| \cdot |V_T|^k$ + the sum of the lengths of the rules of G.
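
The storage estimate just given is easy to evaluate for particular numbers. The Python sketch below does so for a grammar with 18 nonterminals, as in Figure 4.15; the terminal count of 6 and the total right-hand-side length of 34 were read off that figure by hand and should be treated as assumptions of this example, as should the reading of "length of a rule" as the number of symbols on its right-hand side. The function name `table_storage` is hypothetical.

```python
def table_storage(num_nonterminals, num_terminals, k, total_rule_length):
    """Naive LL(k) table storage: one entry per (nonterminal, k-symbol
    lookahead) pair, plus a shared list holding the right-hand sides."""
    entries = num_nonterminals * num_terminals ** k
    return entries + total_rule_length


if __name__ == "__main__":
    # 18 nonterminals as in Figure 4.15; 6 terminals and total
    # right-hand-side length 34 are assumed values for illustration.
    for k in (1, 2, 3):
        print(k, table_storage(18, 6, k, 34))
```

Under these assumptions the estimate is 142 for k = 1, 682 for k = 2 and 3922 for k = 3, which shows how quickly the exponential term comes to dominate.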

So much for theory. In practice, however, the issues are not so sharply
drawn. First of all, the second term is frequently ignored, largely because
it is not clear how to get a handle on that parameter of a grammar or how to
go about making it significantly smaller. It is generally felt that there is
some kind of tradeoff between the number of rules of a grammar and their
individual lengths, so that the sum of the lengths will not vary greatly over
grammars that are obtained from each other by minor modifications. It would
seem then that the value of k, which affects the size of $|V_T|^k$, is the
dominant factor in determining parser size. But this is not strictly true. In
practice, it may well be worthwhile to increase slightly the value of k if that
enables us to eliminate many of the nonterminals of the grammar. The reason
for this is that parsing tables are not organized quite so naively as described
above; frequently there are far fewer than $|V_T|^k$ entries for a given row.
For example if, as often happens, all strings with $\tau$ as a prefix, where
$|\tau| < k$, have the same entry for a given row, there won't be entries for
each of these strings for that row, but just one, labelled by $\tau$.
Furthermore, this parsing table is usually very sparse; i.e., many of its
entries are "ERROR," indicating that with that nonterminal on top of the stack,
the given lookahead cannot occur during a legal parse. At the expense of
delaying the time at which such an error is located, it is possible to
eliminate many of these entries. These remarks are just intended to convey a
feeling for some of the ways of escaping from the tyranny of the exponential in
the table size. For similar discussions for LR(k) parsing, see [4], page 80,
and [1], Chapter 6.
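
One way to picture the prefix trick mentioned above is the following Python sketch, which collapses a single table row so that lookahead strings sharing a prefix and an entry are stored once, under that prefix, and omits ERROR entries altogether. It is a minimal illustration under stated assumptions, not the organization used in [1] or [4]: lookaheads are plain strings of exactly k one-character terminals, and the names `compress_row` and `lookup` are hypothetical.

```python
def compress_row(row, alphabet, k):
    """Collapse a row {k-symbol lookahead: entry} into {prefix: entry}.

    A prefix replaces all of its extensions only when every k-symbol string
    beginning with it is present in the row with the same entry."""
    def build(prefix):
        if len(prefix) == k:
            return {prefix: row[prefix]} if prefix in row else {}
        merged = {}
        child_entries = []
        for symbol in alphabet:
            child = build(prefix + symbol)
            merged.update(child)
            # A child counts as uniform only if it collapsed to its own prefix.
            uniform = len(child) == 1 and (prefix + symbol) in child
            child_entries.append(child[prefix + symbol] if uniform else None)
        first = child_entries[0]
        if first is not None and all(entry == first for entry in child_entries):
            return {prefix: first}
        return merged

    return build("")


def lookup(compressed, lookahead):
    """Find the entry for a k-symbol lookahead; missing entries mean ERROR."""
    for length in range(len(lookahead) + 1):
        if lookahead[:length] in compressed:
            return compressed[lookahead[:length]]
    return "ERROR"


if __name__ == "__main__":
    row = {"aa": "rule 1", "ab": "rule 1", "ba": "rule 2"}
    print(compress_row(row, "ab", 2))
```

For the sample row the two entries beginning with `a` are merged into one labelled `a`, while the `ba` entry and the implicit ERROR for `bb` are left as they are.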

Before we go much further in deciding which parameters of $T_M(G)$ we are
going to try to minimize, we should determine what kind of tools we are going
to use to attain our goals. Since we want this reduction procedure to be
readily applicable, we do not wish to get involved in very recondite further
transformations; and since we want to maintain the LL-ness of the grammar,
and preserve the class of translations which it can direct, we must not use
any transformations which do grave violence to the grammar's structure. We
have decided that the simplest and most appropriate reduction technique is
that of eliminating nonterminals by substitution, as described below.

The choice of this technique reinforces our choice of which grammatical
parameter to focus on; for while it is natural to decrease the number of
symbols and rules in a grammar by repeated substitutions, it is not possible
to reduce the value of the lookahead the grammar needs, merely by substituting
out some nonterminals. So the only significant factor over which we can
really exercise some control is the number of nonterminals in the grammar.
We shall take it as our mission to minimize, by substitution, the number of
nonterminals in T_M(G), while keeping an eye on the effect this process has on
the value of k.

We also observe that if G_2 is obtained from G_1 by substituting out some
nonterminals, then the G_2-tree for a string w will be a compact version of the
G_1-tree for w; so a G_2-parser will go through fewer configurations in
recognizing w than a similar G_1-parser, and so will be faster.

Definition 4.32 Let G be a grammar, A a nonterminal of G. Let the A-rules
of G be A → x_1, A → x_2, ..., A → x_n. Then we eliminate A from G by substitution by
removing A from V_N, removing all the A-rules from the list of rules, and replacing every
rule of the form B → αAβ by the set of rules B → αx_1β, B → αx_2β, ..., B → αx_nβ.

It  is  not  always  possible  to  eliminate  a  nonterminal  from  a  grammar  in 
this  way. 

Lemma 4.33 If A does not occur on the right-hand side of any A-rule, then
A can be eliminated by substitution; the result is a grammar with one less
nonterminal.

Proof Immediate from the definition. Q.E.D.
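The substitution step of Definition 4.32 can be sketched in a few lines of Python. The representation is an assumption made only for illustration: a grammar is a dictionary from each nonterminal to a list of right-hand sides, and each right-hand side is a tuple of symbols. The function name `eliminate` is hypothetical, and the treatment of a rule containing several occurrences of A (each occurrence is expanded independently) is one reading of the definition.

```python
from itertools import product


def eliminate(rules, a):
    """Eliminate nonterminal `a` from `rules` by substitution.

    `rules` maps each nonterminal to a list of right-hand sides (tuples of
    symbols). A new dictionary is returned; the input is left unchanged.
    """
    a_rhss = rules[a]
    # Precondition of Lemma 4.33: A must not occur on the right of an A-rule.
    if any(a in rhs for rhs in a_rhss):
        raise ValueError(f"{a!r} occurs on the right-hand side of its own rule")
    result = {}
    for lhs, rhss in rules.items():
        if lhs == a:
            continue  # the A-rules themselves are removed
        new_rhss = []
        for rhs in rhss:
            # Every occurrence of A ranges over all right-hand sides of A;
            # every other symbol stays as it is.
            choices = [a_rhss if sym == a else [(sym,)] for sym in rhs]
            for combo in product(*choices):
                expanded = tuple(s for part in combo for s in part)
                if expanded not in new_rhss:
                    new_rhss.append(expanded)
        result[lhs] = new_rhss
    return result


example = {"S": [("a", "A", "b")], "A": [("x",), ("y", "z")]}
print(eliminate(example, "A"))
```

In this toy grammar the single rule S → aAb is replaced by S → axb and S → ayzb, and the result has one nonterminal fewer, as the lemma states.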

We are interested in eliminating more than one nonterminal from the
grammar T_M(G). Rather than try to define the concept of eliminating several
nonterminals at the same time, we shall say that this is done by eliminating
one nonterminal from T_M(G), eliminating another from the resultant grammar, and
so on. We would like to know how many (and which) of the nonterminals of
T_M(G) can be ultimately disposed of by repeated applications of the
substitution process. Sometimes even though each member of a set of nonterminals
could be eliminated as an individual, it is not possible to eliminate every
one in the set.


Definition 4.34 Let A_1, ..., A_n be a sequence of distinct nonterminals such
that A_1 appears on the right-hand side of an A_n-rule, A_{i+1} appears on the
right-hand side of an A_i-rule for 1 ≤ i < n, and A_i does not appear on the
right-hand side of any other A_j-rules. Then A_1, ..., A_n is a recursive
sequence of nonterminals.

It is possible for any single nonterminal to appear in several recursive
sequences. We are not interested in the ordering of the nonterminals in a
recursive sequence; we shall call a set of nonterminals a recursive sequence
if there is some ordering of the nonterminals which makes them into a recursive
sequence.


Lemma 4.35 If A_1, ..., A_n is a recursive sequence of nonterminals of G, then
not all of A_1, ..., A_n can be eliminated from G.

Proof After eliminating all nonterminals except A_i, there will be some A_i-rule
with A_i on its right-hand side; then A_i cannot be eliminated. Q.E.D.
The following results are trivial.

Lemma 4.36 Suppose A_1, ..., A_n is not a recursive sequence. Then all of
A_1, ..., A_n can be eliminated from G, and the elimination can be done in
any order.

Lemma 4.37 If G' is obtained from G by eliminating a nonterminal of G by
substitution, then L(G) = L(G').

In other words, for every different recursive sequence of nonterminals,
at least one of the nonterminals cannot be removed from the grammar. However,
any nonterminal not in any such sequence can always be substituted out.

This leads us to formulate the following tentative plan for removing as many
nonterminals as possible from an arbitrary grammar G: 1) find all distinct
recursive sequences of G; 2) find a minimum set of nonterminals such that
there is at least one member of the set in each recursive sequence; 3) eliminate
all nonterminals other than those in this minimum set, in any order.
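The first two steps of this plan can be sketched as follows. This is a brute-force illustration under stated assumptions, not an efficient procedure: grammars are dictionaries from nonterminals (strings) to lists of right-hand-side tuples, recursive sequences are taken to be the node sets of simple cycles of the "occurs on the right-hand side of" relation, and the start symbol is given no special treatment. The names `simple_cycles` and `minimum_hitting_set` are hypothetical.

```python
from itertools import combinations


def simple_cycles(rules):
    """Node sets of all simple cycles of the 'occurs on a right-hand side' relation."""
    succ = {a: sorted({s for rhs in rhss for s in rhs if s in rules})
            for a, rhss in rules.items()}
    order = sorted(rules)
    found = set()

    def extend(start, node, path):
        for nxt in succ[node]:
            if nxt == start:
                found.add(frozenset(path))  # the path closes into a cycle
            elif nxt not in path and order.index(nxt) > order.index(start):
                # Only visit nodes later than `start`, so that each cycle is
                # discovered from its smallest member.
                extend(start, nxt, path + [nxt])

    for start in order:
        extend(start, start, [start])
    return found


def minimum_hitting_set(sets, universe):
    """Smallest subset of `universe` meeting every set in `sets` (exhaustive search)."""
    for size in range(len(universe) + 1):
        for candidate in combinations(sorted(universe), size):
            if all(set(candidate) & s for s in sets):
                return set(candidate)
    return None


rules = {
    "S": [("a", "B")],
    "B": [("b", "C"), ("c",)],
    "C": [("B", "d"), ("D",)],
    "D": [("d",)],
}
cycles = simple_cycles(rules)
keep = minimum_hitting_set(cycles, set(rules))
print(cycles, keep)
```

In the toy grammar the only recursive sequence is {B, C}, so one of B and C must be kept; every nonterminal outside the chosen set is then a candidate for step 3.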

It is not hard to show that this procedure does work, and eliminates a
maximal number of nonterminals from the grammar. However, this process is
not immediately applicable to our situation, for we are trying to minimize
the nonterminals of T_M(G) subject to the constraint of keeping the grammar LL.
We do not insist on not increasing the value of k, but we do need the grammar
to remain LL(k') for some k'. Let us see what restrictions this places on
which nonterminals can be eliminated from T_M(G).

There are four kinds of rules in T_M(G): (X, φ) → a(X, φa), (X, φ) → (Y, ε)(X, φY),
(X, φα) → (X, φA), and (X, X) → ε. Looking at it from the LL(k) lookahead
point of view, the elimination of different nonterminals might have
different effects. For example, if we replaced the instance of (X, φa) in
(X, φ) → a(X, φa) by all the right-hand sides which (X, φa) goes to, we
would at worst increase the lookahead for the grammar by one. That is, since
T_M(G) is strong LL(k), if (X, φa) → R_1 and (X, φa) → R_2, then R_1 and R_2 can
be distinguished by k-lookahead; hence, aR_1 and aR_2 can be distinguished by
k + 1 lookahead. Similarly, replacing (X, φA) in (X, φα) → (X, φA) by all
its right-hand sides does not affect the lookahead value of the grammar
at all. However, eliminating (X, φY) from (X, φ) → (Y, ε)(X, φY) could
completely destroy the LL-ness of the grammar, if (Y, ε) generates an infinite
set of strings. In that case, if the rules for (X, φY) were (X, φY) → R_1 and
(X, φY) → R_2, we would get (X, φ) → (Y, ε)R_1 and (X, φ) → (Y, ε)R_2; since
(Y, ε) can generate arbitrarily long strings we would not, in general, be
able to "see" past (Y, ε), to examine the rest of the lookahead and distinguish
(Y, ε)R_1 from (Y, ε)R_2.

This completes the enumeration of all ways that nonterminals can be used
on the right-hand side of a rule; and we have seen that the only usage that
can give us trouble if eliminated is (X, φY) in (X, φ) → (Y, ε)(X, φY), if
(Y, ε) generates an infinite set of strings. All other instances of
nonterminals can be substituted out, without fear of doing violence to the
LL-ness of the grammar; at worst, eliminating (X, φa) from (X, φ) → a(X, φa) can
increase the value of k by one. But since we do not eliminate individual
"instances" of nonterminals, but expunge the nonterminal completely, we get
the following tentative summation. The elimination of a nonterminal from
T_M(G) destroys the LL-ness of the grammar if and only if: the nonterminal is
(X, φY); it appears in (X, φ) → (Y, ε)(X, φY); and (Y, ε) generates an
infinite set of strings.

There is, however, one codicil which must be added to this statement.
There is one case where (X, φY) can be removed from the grammar even though
it satisfies all of these conditions; and that is when there is exactly
one rule with (X, φY) as its left-hand side. In such a case, no confusion can
result from eliminating (X, φY); and then instead of having (X, φ) → (Y, ε)(X, φY)
and (X, φY) → R, we get (X, φ) → (Y, ε)R. Now the same comments
that applied to (X, φY) will apply to the first nonterminal of R; if there
is only one rule with it on the left-hand side, it can be eliminated from
the grammar, but otherwise it cannot be. If there is a sequence of such
nonterminals, then all of the nonterminals in the sequence can be eliminated.

To  make  this  precise,  we  introduce  the  following  definition. 

Definition 4.38 A string of symbols A_1, A_2, ..., A_n is a chain in G if for each
i, 1 ≤ i < n, A_i is a nonterminal that appears on the left-hand side of
exactly one rule, where A_{i+1} is the first nonterminal on the right-hand side
of the A_i-rule, and A_n is ε, if A_{n-1} → ε is a rule, or otherwise a nonterminal
that appears on more than one left-hand side. We say that A_1 is the head of
the chain.

If (X, φ) → (Y, ε)(X, φY) is a rule, and A_1, ..., A_n is a chain headed
by (X, φY), then we can eliminate all nonterminals in the chain if A_n = ε;
otherwise, we can eliminate all but one.
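Computing the chain headed by a given nonterminal is a simple walk through the rules. The sketch below assumes the same dictionary-of-tuples grammar representation as before; the function name `chain` and the marker `EPSILON` are hypothetical. Two cases fall outside Definition 4.38 (a single rule whose right-hand side contains only terminals, and a loop of single-rule nonterminals); the sketch simply stops there, which is a choice made for illustration.

```python
EPSILON = "ε"


def chain(rules, head):
    """Return the chain headed by `head` as a list of symbols.

    The walk continues while the current nonterminal has exactly one rule.
    It ends with EPSILON if that rule is an empty rule, or with the first
    nonterminal reached that has more than one rule.
    """
    seq = [head]
    current = head
    while len(rules[current]) == 1:
        (rhs,) = rules[current]
        if not rhs:
            seq.append(EPSILON)  # the chain ends in the empty string
            break
        nxt = next((s for s in rhs if s in rules), None)
        if nxt is None or nxt in seq:
            break  # not covered by the definition; stop the walk
        seq.append(nxt)
        current = nxt
    return seq


rules = {
    "P": [("Q",)],
    "Q": [("x", "R")],
    "R": [()],
    "T": [("U",)],
    "U": [("y",), ("z",)],
}
print(chain(rules, "P"))  # ends in ε: every member could be eliminated
print(chain(rules, "T"))  # ends in U, which has two rules: one member must stay
```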

We can combine this new condition with our previously described method
for eliminating the maximum number of nonterminals from a grammar. We now
have what amounts to an additional constraint, that no chain headed by such an
(X, φY) can be wholly eliminated. We can identify each of these (X, φY) nonterminals,
and compute the chain which it heads; we then must find a minimal set of
nonterminals which contains a representative from each such chain as well as
from each recursive sequence. This minimization can be addressed by
constructing a Boolean expression in conjunctive normal form, one term for each
recursive sequence and for each appropriate chain; a term will be the
disjunction of the symbols in the set. We then must find the minimal cover
of this expression, the smallest (fewest literals) expression A ∧ B ∧ C ...
which logically implies the original expression. There are standard methods
for computing such a cover. We note that there may be several different
minimal covers of the same size; presently we shall give some criteria for
choosing among them.

We now present a technique for making this grammatical minimization
process a little more structured and manageable.

Definition 4.39 For a given grammar T_M(G), the nonterminal graph of T_M(G) is
a directed graph constructed as follows: the nodes of the graph are the
nonterminals of T_M(G) and the symbol ε; and there is an edge from nonterminal A_i
to symbol A_j if and only if A_j appears on the right-hand side of a rule for A_i.

For example, the graph for the T_M(G) of Figure 4.15 is given in Figure 4.16.

Figure 4.15


We can use this nonterminal graph for a variety of purposes. First of
all, to determine those (Y, ε) nonterminals which generate infinite sets of strings.
It is obvious that if A generates an infinite set of strings, there must be
some nonterminal B such that A ⇒* xBy and B ⇒* uBv. But such a recursive
nonterminal B would occur in a cycle in the nonterminal graph, and can thus be
detected. Thus (Y, ε) generates an infinite set if and only if there is some
nonterminal (X, φ) accessible from (Y, ε), such that (X, φ) is contained in
some cycle in the graph.

The cycles in this graph also immediately give us all recursive sequences
in T_M(G), since a set of nonterminals form a recursive sequence if and only if they
form a cycle in the graph. So our plan should be to first locate all cycles in
this graph. This immediately gives us all the recursive sequences of
T_M(G), and enables us to identify those nonterminals of the form
(Y, ε) which generate infinite sets. Once we have determined these (Y, ε),
we are in a position to pick some chains which cannot be eliminated from
the grammar: namely, those headed by nonterminals of the form (X, φY).
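The graph test for nonterminals that generate infinite sets can be sketched as a reachability computation. The assumptions are the usual ones for this illustration: grammars are dictionaries from nonterminals to lists of right-hand-side tuples, the ε node of the nonterminal graph is left out because it has no outgoing edges, and a nonterminal counts as accessible from itself. The function names are hypothetical.

```python
def nonterminal_graph(rules):
    """Edge A -> B whenever nonterminal B occurs on the right-hand side of an A-rule."""
    return {a: {s for rhs in rhss for s in rhs if s in rules}
            for a, rhss in rules.items()}


def reachable(graph, start):
    """Nodes reachable from `start` along one or more edges."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def generates_infinite_set(rules, a):
    """Criterion of the text: some nonterminal accessible from `a` lies on a cycle."""
    graph = nonterminal_graph(rules)
    candidates = reachable(graph, a) | {a}
    # A node lies on a cycle exactly when it can reach itself.
    return any(node in reachable(graph, node) for node in candidates)


rules = {
    "S": [("a", "B")],
    "B": [("b",), ("a", "B", "C")],
    "C": [("c",)],
}
for nt in rules:
    print(nt, generates_infinite_set(rules, nt))
```

Here B is recursive, so both S and B are reported as generating infinite sets, while C is not.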

In this example, there is only one cycle in the graph, consisting of
the nonterminals (B, ε) and (B, a). So at least one of these must not be
eliminated from the grammar. Furthermore, this means that (B, ε) generates
an infinite set of strings. Hence we must consider the nonterminals (S, aB)
and (B, aB). The first of these heads the chain (S, aB), (S, A), (S, Ax),
(S, S), ε; since ε is in the chain, none of the nonterminals in it need
remain in the grammar. However, either (B, aB) or (B, A) or (B, Ax) must
remain in the grammar, since (B, aB), (B, A), (B, Ax) forms a chain. Hence
the logical expression describing which nonterminals must be retained in the
grammar is ((S, ε)) ∧ ((B, ε) ∨ (B, a)) ∧ ((B, aB) ∨ (B, A) ∨ (B, Ax)).

There are a variety of minimal covers for this expression, but each of them
contains three nonterminals. So 15 of the 18 nonterminals of this grammar
can be eliminated, and still leave the result LL. Two possible covers for this
expression are (S, ε) ∧ (B, ε) ∧ (B, aB) and (S, ε) ∧ (B, a) ∧ (B, A). The
two grammars corresponding to these choices are given in Figure 4.17.
(S, ε) → a(B, ε)x
(S, ε) → acx
(S, ε) → adx
(B, ε) → acx
(B, ε) → acxy
(B, ε) → adx
(B, ε) → adxy
(B, ε) → a(B, ε)(B, aB)
(B, ε) → b
(B, aB) → x
(B, aB) → xy

(S, ε) → acx
(S, ε) → adx
(S, ε) → abx
(S, ε) → aa(B, a)x
(B, a) → c(B, A)
(B, a) → d(B, A)
(B, a) → b(B, a)
(B, a) → a(B, a)(B, A)
(B, A) → x
(B, A) → xy

Figure 4.17


Let us create a little terminology to make it easier to talk about these
new grammars.

Definition 4.40 Let T_M(G) be a grammar derived from a cycle-free MSP(k)
machine M for G. Then a minimizing term of T_M(G) is defined as follows:
1) if A_1, A_2, ..., A_n is a recursive sequence of nonterminals of T_M(G), then
(A_1 ∨ A_2 ∨ ... ∨ A_n) is a minimizing term; 2) if (Y, ε) generates an infinite
set of strings, and if (X, φ) → (Y, ε)(X, φY) is a rule of T_M(G), and if
(X, φY), A_1, A_2, ..., A_n is a chain, then ((X, φY) ∨ A_1 ∨ A_2 ∨ ... ∨ A_n) is
a minimizing term. If X_1, X_2, ..., X_m are all the minimizing terms for T_M(G),
then X_1 ∧ X_2 ∧ ... ∧ X_m is the minimizing expression for T_M(G). If {A_1, A_2,
..., A_p} is a set of nonterminals of T_M(G), then it is a minimal cover of the
minimizing expression if A_1 ∧ A_2 ∧ ... ∧ A_p ⇒ X_1 ∧ X_2 ∧ ... ∧ X_m, and if there is
no smaller set {B_1, ..., B_q} such that B_1 ∧ B_2 ∧ ... ∧ B_q ⇒ X_1 ∧ X_2 ∧ ... ∧ X_m.
(Here, ⇒ is used in the logical sense of implication.)

Definition 4.41 Let T'_M(G) be a grammar obtained by eliminating some
nonterminals of T_M(G) by substitution. If the nonterminals of T'_M(G) form a
minimal cover of the minimizing expression of T_M(G), then T'_M(G) is called a
minimal version of T_M(G).

These  definitions  merely  formalize  our  preceding  discussion. 

Lemma 4.42 Let G be a strong LL(k) grammar, G' the grammar obtained by
eliminating the nonterminal A from G by substitution. Then G' is strong LL(k') for
some k' ≥ k, unless both of the following conditions hold:

i) There is a rule of G of the form B → αAβ, where α generates an
infinite set of strings.

ii) There is more than one A-rule.

Proof Suppose that G' is not strong LL(k') for any k'. Then for some B and
terminal string w, B ⇒* w beginning with rule p_1 and B ⇒* w beginning with rule
p_2, in G'. But if p_1 is B → ψ_1 and p_2 is B → ψ_2, and if p_1 and p_2 were both
rules of G, then FIRST_k(ψ_1 FOLLOW_k(B)) ∩ FIRST_k(ψ_2 FOLLOW_k(B)) = ∅, since
G was strong LL(k). Therefore at least one of the rules must be new to G'.
That is, p_1 is B → αφ_1β, where B → αAβ and A → φ_1 were rules of G. But then
FIRST_k(αφ_1β) ⊆ FIRST_k(αAβ); yet FIRST_k(αAβ FOLLOW_k(B)) ∩ FIRST_k(ψ_2
FOLLOW_k(B)) = ∅ if p_2 is a rule of G. Therefore p_2 is a new rule of G' as well,
of the form B → γφ_2η, where B → γAη and A → φ_2 were rules of G. But FIRST_k(αφ_1β)
⊆ FIRST_k(αAβ) and FIRST_k(γφ_2η) ⊆ FIRST_k(γAη), while FIRST_k(αAβ FOLLOW_k(B))
∩ FIRST_k(γAη FOLLOW_k(B)) = ∅ if B → αAβ and B → γAη are different rules
of G, since G was strong LL(k). Hence B → αAβ and B → γAη were the same rule,
say B → αAβ. Hence if p_1 and p_2 are different rules, then φ_1 ≠ φ_2, so there
are two different A-rules. Now suppose that α did not generate an infinite
set; let the length of the longest terminal string it generates be k_1. We
know that FIRST_k(φ_1) ∩ FIRST_k(φ_2) = ∅, since they are both right-hand sides
of A-rules; hence if we set k' = k + k_1, we have FIRST_k'(αφ_1β) ∩ FIRST_k'(αφ_2β)
= ∅. Therefore G' will be strong LL(k'), which contradicts the hypothesis.
Hence α generates an infinite set, and both required conditions hold.

If both conditions do hold, then B → αφ_1β and B → αφ_2β are both rules
of G', where α generates indefinitely long terminal strings. Hence G' is not
strong LL(k') for any k'. Q.E.D.

Theorem 4.43 If T'_M(G) is a minimal version of T_M(G), then L(T'_M(G)) = L(T_M(G))
and T'_M(G) is LL(k') for some k'.

Proof By the definition of minimal version and the preceding lemmas. Q.E.D.

Theorem 4.44 Let T'_M(G) be a minimal version of T_M(G), and let G' be obtained
by eliminating some nonterminal of T'_M(G) by substitution. Then G' is not
LL(k') for any k'.

Proof The nonterminals of T'_M(G) form a minimal cover of the minimizing
expression of T_M(G). Since G' is obtained by eliminating a nonterminal of T'_M(G),
its nonterminals do not form such a cover. So there is some minimizing term
of T_M(G) which does not contain any nonterminal of G'. By Lemma 4.35, since
G' is well-defined, this term is not formed from a recursive sequence of
nonterminals of T_M(G). Hence this term is that constructed for some chain of
T_M(G). Thus if the nonterminal being eliminated from T'_M(G) is A, there are
rules B → αAβ, A → φ_1, and A → φ_2 in T'_M(G), where α generates an infinite
set of strings. Hence by Lemma 4.42, G' is not LL(k') for any k'. Q.E.D.

Lemma 4.45 Let G' be obtained by eliminating by substitution some nonterminal
of G. Then any simple translation on G has an equivalent simple translation
on G'.

Proof The new translation is constructed as follows. The rules of G that are
also in G' retain their translation elements. If B → αφβ is a new rule of G',
created by substituting φ for A in B → αAβ, consider the translation elements
for these rules: B → αAβ {α'A'β'} and A → φ {φ'}. Then the translation element
for B → αφβ will be {α'φ'β'}. It is straightforward that this new translation
is simple and is equivalent to the original one. Q.E.D.
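The construction in this proof amounts to performing the same splice on the translation element as on the right-hand side. The sketch below illustrates it under simplifying assumptions: a rule is a pair (right-hand side, translation element), both given as tuples of symbols, the eliminated nonterminal A is written identically in both, and A occurs once in the rule for B. The function name `substitute_with_translation` is hypothetical.

```python
def substitute_with_translation(b_rule, a_rule, a):
    """Substitute the A-rule into the B-rule, carrying the translation along.

    Each rule is a pair (rhs, translation) of symbol tuples. Returns the
    pair for the new rule B -> alpha phi beta {alpha' phi' beta'}.
    """
    (b_rhs, b_out), (a_rhs, a_out) = b_rule, a_rule

    def splice(sequence, replacement):
        out = []
        for symbol in sequence:
            out.extend(replacement if symbol == a else (symbol,))
        return tuple(out)

    return splice(b_rhs, a_rhs), splice(b_out, a_out)


# B -> a A c {A x}  together with  A -> b {y}
b_rule = (("a", "A", "c"), ("A", "x"))
a_rule = (("b",), ("y",))
print(substitute_with_translation(b_rule, a_rule, "A"))
```

For these toy rules the new rule is B → abc with translation element {yx}, which produces the same output as applying the two original rules in succession.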


Theorem 4.46 If T'_M(G) is a minimal version of T_M(G), then any simple
translation on T_M(G) has an equivalent simple translation on T'_M(G).

Proof By preceding lemma. Q.E.D.


Using  the  same  principle,  we  get: 


Theorem 4.47 Let P be any compiler based on the LL(k) parser for T_M(G). Then
there is an equivalent compiler based on the LL(k') parser for T'_M(G), where
T'_M(G) is a minimal version of T_M(G) which is LL(k').

In summary then, if T'_M(G) is a minimal version of T_M(G), then it generates
the same language that T_M(G) does, and it is also LL(k') for some k'; it also
supports the same compilations that T_M(G) does. Furthermore, it really is
minimal in the sense that any further nonterminal elimination is either
impossible or destroys the grammar's LL-ness. We can compute all minimal versions
of T_M(G) by constructing the nonterminal graph for T_M(G), using the graph to
find the minimizing expression, and then finding all minimal covers for this
expression. Each minimal cover gives rise to a different minimal version,
with nonterminals being those of the cover.

How shall we select among the various different minimal versions of a
particular T_M(G)? Let us consider the two different minimal versions of
Figure 4.17. They each have three nonterminals; the first has eleven
productions, the second has ten. But most important, the first grammar is only
LL(4) (because of the rules (B, ε) → acx and (B, ε) → acxy), while the second
is LL(2). By any reasonable standards then, the second is far superior. Is
there any way we could have been guided to select that grammar, short of
constructing all possible minimal versions of T_M(G) and then choosing the one
with the smallest lookahead value?

Let us first recall how the lookahead value of a grammar can increase
during the process of eliminating nonterminals. As we have seen, this value
can be increased by one when two possible right-hand sides for (X, φa) replace
it in (X, φ) → a(X, φa), resulting in (X, φ) → ab(X, φab) and (X, φ) → ac(X,
φac). Repeated substitutions like this result in rules like (X, φ) → abcdη_1
and (X, φ) → abcdη_2, increasing the value of the needed lookahead even
further. We can anticipate situations like this by examining the nonterminal graph
for T_M(G). A path through this graph is a sequence of vertices, such that
there is an edge connecting each element of the sequence to the next one. It
is easy to see that if p is a path in the graph connecting one element of
the chosen minimal cover to another element, or to the vertex for ε, with no
intervening vertices from the cover in the path, then there will be a rule in
the minimal version of the grammar derived from p; in particular, if p
connects (X, φ_1) to (X, φ_2), the rule will be (X, φ_1) → γ(X, φ_2), where the
nature of γ depends on the vertices the path passes through. Consequently,
if there are paths from (X, φ_1) to (X, φ_2) and to (X, φ_3) which are coincident
for their first parts, we will have rules of this form (X, φ_1) → γγ_1(X, φ_2) and
(X, φ_1) → γγ_2(X, φ_3). For example, consider the graph of Figure 4.16. If
the nonterminals of the minimum cover are (S, ε), (B, ε) and (B, aB), then
there are the following paths through the graph: (B, ε), (B, a), (B, ac),
(B, A), (B, Ax), (B, B), ε; and (B, ε), (B, a), (B, ac), (B, A), (B, Ax),
(B, Axy), (B, B), ε. These paths are identical for a while, and give rise
to rules (B, ε) → acx and (B, ε) → acxy, which make the grammar LL(4). On
the other hand, if the nonterminals are (S, ε), (B, a), and (B, A), the above two
sequences are not paths, since the nonterminal (B, A) is in the interior of each of them.
The significance of all this is that we should be sensitive to this issue
and attempt to prevent the existence of such paths through the graph. This can
be done by trying to choose the nonterminals of the minimal cover so that
there is a nonterminal closely "in front of" each branching in the graph. Then
it will be impossible for there to be two long paths that start at the vertex
for a nonterminal of the cover, that are identical for a long time, and that
then diverge at a branching; impossible, because both these paths would run
into the vertex of some other nonterminal of the cover before they reached the
branching. Then the early part of the two sequences would be one path, giving
rise to only a single rule; while the two paths which do eventually diverge
are not coincident for very long, thus not increasing the lookahead greatly.

Admittedly these are only qualitative arguments, but they give us some
handle for dealing with the choice of a minimal version of T_M(G) that
increases the lookahead value the least. We now have some criteria for
comparing the attractiveness of different minimal covers for the same grammar, the
idea being to choose that one whose nonterminals are "closest" to branches in
the graph. In the example we have been considering, this approach would
have correctly led us to choose the minimal version with nonterminals (S, ε),
(B, a), and (B, A), as opposed to other possibilities.

We can consider extending this approach, to creating a quasi-minimal
version of T_M(G), that contains more nonterminals than are strictly necessary
for a minimal version, but which are carefully chosen so as to keep down the
lookahead value of the resulting grammar. That is, it may be that any truly
minimal version of T_M(G) has an unsatisfactorily large lookahead; but by
judiciously adding some extra nonterminals, it may be possible to reduce
dramatically this lookahead value and hence the size of the parsing table. These
additional nonterminals will be chosen to be "near" the branchings in the
nonterminal graph for T_M(G). In our example, if we chose (S, ε), (B, ε), and
(B, Ax) as the minimal cover, and included the additional nonterminal (S, a),
because it is situated at a branching, we would get the grammar of Figure 4.18.
Though this grammar contains four nonterminals, which is one more than is
strictly necessary for a minimal version of T_M(G), it is also LL(1), which is
better than any really minimal version of T_M(G). We mention this as being
of possible practical interest; we do not have any formal insight into this
process of including superfluous nonterminals, or a precise analysis of
when and how it might best be done.
(S, ε) → a(S, a)
(S, a) → cx
(S, a) → dx
(S, a) → a(B, a)x
(B, a) → cx(B, Ax)
(B, a) → dx(B, Ax)
(B, a) → bx(B, Ax)
(B, a) → a(B, a)(B, Ax)
(B, Ax) → y
(B, Ax) → ε

Figure 4.18
CHAPTER 5

CONSTRUCTING  CYCLE-FREE  MACHINES 


5.1  Transformable  Grammars 

We have carefully explored the features of a new transformation which
creates, for a non-LL(k) grammar G, an equivalent LL(k) grammar T_M(G). We
have studied various properties of the transformed grammar, and have also seen
how it may be drastically reduced in size in order to bring it to more
manageable proportions for use in a compiler. But now we must return to an earlier
problem, because the point of departure for this transformation is not the
original grammar G, but rather a cycle-free MSP(k) machine for G. The
transformed grammar is even denoted T_M(G), since it is derived from a
particular cycle-free machine for G, M. This state of affairs immediately raises a
number of important questions. How do we find a cycle-free MSP(k) machine
for a given grammar G? How do we know whether or not there even exists such
a machine? And if there are several cycle-free MSP(k) machines for G (and
there often will be), what criteria should we use for determining which of
these machines is most appropriate for application of the transformation
procedure? Even further, could we be directed in our construction of
cycle-free MSP(k) machines so that only the "best" such machine is constructed? It
is to issues such as these that we turn our attention in this chapter.

Definition 5.1 A grammar G is k-transformable if there exists a cycle-free
MSP(k) machine for G.

This definition enables us to couch our discussions in terms of grammars
rather than in terms of particular machines for those grammars. A
k-transformable grammar is suitable for application of the transformation process: we
can try to find a cycle-free machine for it and then read a derived grammar
off the machine. We shall try to get some feeling for this class of grammars.

Theorem 5.2 For an LR(k) grammar G, there are only finitely many MSP(k)
machines for G.

Proof Since there are only finitely many LR(k)-items over G, there are only
a finite number of possible states for an MSP(k) machine for G. Then since
any machine has only finitely many states, all different, with only a finite
number of connections among the states, the result is immediate. Q.E.D.

Theorem 5.3 It is decidable whether or not an LR(k) grammar G is
k-transformable.

Proof The decision process is very straightforward: it consists of starting
with the canonical LR(k) machine for G, constructing all possible MSP(k)
machines for G from it, and then seeing whether or not any of these machines
is cycle-free. By Theorem 3.52, every MSP(k) machine for G can be obtained
by starting with the LR(k) machine and performing a sequence of state-splittings,
none of which interfere with any of the previous ones. We also know, from Corollary
3.19, that it is possible to compute all possible state-splittings of any
given state. Thus we have an effective procedure to compute all possible MSP(k)
machines for an LR(k) grammar G. We begin with the LR(k) machine for G; at
each stage, we will have some MSP(k) machine for G, and we pick some intermediate
state of that machine which does not dominate any base state, choose
some splitting of that state, and replace the state by that splitting. If
there is no such intermediate state, or no splitting of it, we backtrack
and try another choice. In this way it is possible to exhaust all possible
MSP(k) machines. Since at each stage, the current MSP(k) machine has one
more base state than the machine of the preceding stage, we never get caught
in a loop which constructs the same machines over and over. Once this construction
phase is completed, we can examine each of the MSP(k) machines and see
whether some one is cycle-free, and thus complete our decision procedure.

Q.E.D. 
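
The backtracking search in this proof can be sketched generically. The Python
below is only an illustration under stated assumptions: the callbacks
`candidate_states`, `splittings`, `apply_splitting` and `is_cycle_free` are
hypothetical stand-ins for the constructions of Chapter 3, a machine is taken to
be any hashable value, and each machine is tested as soon as it is generated
rather than after the whole construction phase.

```python
from typing import Callable, Hashable, Iterable, Optional


def find_cycle_free_machine(
    start: Hashable,
    candidate_states: Callable[[Hashable], Iterable],
    splittings: Callable[[Hashable, object], Iterable],
    apply_splitting: Callable[[Hashable, object, object], Hashable],
    is_cycle_free: Callable[[Hashable], bool],
) -> Optional[Hashable]:
    """Explore every machine reachable from `start` by state-splittings.

    Returns the first cycle-free machine met, or None if there is none.
    All four callbacks are assumptions of this sketch.
    """
    seen = {start}   # machines already generated, so none is expanded twice
    stack = [start]  # explicit stack: depth-first search with backtracking
    while stack:
        machine = stack.pop()
        if is_cycle_free(machine):
            return machine
        for state in candidate_states(machine):
            for splitting in splittings(machine, state):
                successor = apply_splitting(machine, state, splitting)
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
    return None
```

The loop terminates whenever the set of reachable machines is finite, which is
what Theorem 5.2 provides for MSP(k) machines.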

This decision procedure is not quite as unsatisfactory as it seems at
first. Although it does blindly create all MSP(k) machines for G (and some
particular machines possibly several times), we have given deterministic
non-heuristic algorithms for determining splittings of a state and replacing
a state by a splitting. This decision procedure would be suitable for
execution by a computer, especially since it would only have to be done once,
when trying to construct an LL(k) grammar for the language under consideration.
Furthermore, in later sections we shall see how this procedure might be greatly
improved by an understanding of how cycles may be removed from a machine by
state-splitting.

Our first order of business, however, is to get a feeling for the extent
of the class of k-transformable grammars. Our first major result will show
that they include all the LL(k) grammars.

A word about notation. The lemmas that follow are very technical and
their proofs are tedious; in the interests of simplifying the symbology and
making the proofs more readable, we have omitted explicit mention of the
right context H. It is to be assumed that it is present, however, so
that $\mathrm{FIRST}_k$ of a string will always have k symbols in it.
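
As a concrete reading of this convention, the following Python sketch computes
$\mathrm{FIRST}_k$ of a string of grammar symbols and pads the result with k
end markers, so that every element has exactly k symbols. The grammar encoding
(a dict from each nonterminal to a list of right-hand-side tuples), the function
names, and the end marker "$" are assumptions made for illustration; they are
not notation from the text.

```python
from itertools import product


def concat_k(left, right, k):
    """k-truncated concatenation of two sets of terminal tuples."""
    return {(a + b)[:k] for a, b in product(left, right)}


def first_k_table(grammar, k):
    """FIRST_k of every nonterminal, computed by fixpoint iteration.

    Any symbol that is not a key of `grammar` is treated as a terminal.
    """
    table = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                acc = {()}
                for sym in rhs:
                    step = table[sym] if sym in grammar else {(sym,)}
                    acc = concat_k(acc, step, k)
                if not acc <= table[nt]:
                    table[nt] |= acc
                    changed = True
    return table


def first_k(grammar, symbols, k, end="$"):
    """FIRST_k of a symbol string followed by a right context of k end markers."""
    table = first_k_table(grammar, k)
    acc = {()}
    for sym in symbols:
        step = table[sym] if sym in grammar else {(sym,)}
        acc = concat_k(acc, step, k)
    return concat_k(acc, {(end,) * k}, k)
```

With the padding in place, a short string such as a single terminal c yields a
k-tuple ending in end markers rather than a tuple shorter than k.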

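Lemma 5.4 If $A \to \alpha_1 . \alpha_2 (\tau)$ and $B \to \beta_1 . \beta_2 (\omega)$ are essential items of the same
state of the LR(k) machine for G, then for some $u \in V_T^*$, $S \Rightarrow_L^* u\alpha_2\Psi_1$ and
$S \Rightarrow_L^* u\beta_2\Psi_2$ such that $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and $\omega \in \mathrm{FIRST}_k(\Psi_2)$.

Proof Every state of the LR(k) machine is a p-successor of the initial state
for some p. We proceed by induction on the length of p.

If |p| = 0, then there are no essential items in q, so to start the
induction off safely we must consider the case where |p| = 1. Then q is a
$\sigma$-successor of the initial state, and the items in question are $A \to \sigma . \alpha_2 (\tau)$
and $B \to \sigma . \beta_2 (\omega)$. Then $A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$ are items of the initial
state, and are each descended from S-items. Then we have $S \Rightarrow_L^* A\Psi_1$ and
$S \Rightarrow_L^* B\Psi_2$ with $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and $\omega \in \mathrm{FIRST}_k(\Psi_2)$; then if we choose u such
that $\sigma \Rightarrow^* u$, we have
$S \Rightarrow_L^* A\Psi_1 \Rightarrow_L \sigma\alpha_2\Psi_1 \Rightarrow_L^* u\alpha_2\Psi_1$ and
$S \Rightarrow_L^* B\Psi_2 \Rightarrow_L \sigma\beta_2\Psi_2 \Rightarrow_L^* u\beta_2\Psi_2$, as desired.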

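Now say |p| = n+1. Then $p = \alpha\sigma$, where $|\alpha| = n$; and q is the $\sigma$-successor
of q', which is some $\alpha$-successor of the initial state. Then the items of q in
question are $A \to \alpha_1\sigma . \alpha_2 (\tau)$ and $B \to \beta_1\sigma . \beta_2 (\omega)$; and $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ and
$B \to \beta_1 . \sigma\beta_2 (\omega)$ are items of q'. Now either both, one, or neither of these items
is an essential item of q'. If both are, then by induction we have
$S \Rightarrow_L^* u\sigma\alpha_2\Psi_1$ and $S \Rightarrow_L^* u\sigma\beta_2\Psi_2$; if we choose $u_1$ such that $\sigma \Rightarrow^* u_1$ and set
$u_2 = uu_1$, we have $S \Rightarrow_L^* u_2\alpha_2\Psi_1$ and $S \Rightarrow_L^* u_2\beta_2\Psi_2$, with $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and
$\omega \in \mathrm{FIRST}_k(\Psi_2)$ as required. If $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ is essential in q' while
$B \to . \sigma\beta_2 (\omega)$ is not, then the latter is descended from $C \to \gamma_1 . D\gamma_2 (\rho)$, which
is essential in q'. Then $D \Rightarrow^* \sigma\beta_2\varphi$, such that $\omega \in \mathrm{FIRST}_k(\varphi\gamma_2\rho)$. By induction,
$S \Rightarrow_L^* u\sigma\alpha_2\Psi_1$ and $S \Rightarrow_L^* uD\gamma_2\Psi_2$ with $\rho \in \mathrm{FIRST}_k(\Psi_2)$. Then
$S \Rightarrow_L^* uD\gamma_2\Psi_2 \Rightarrow_L^* u\sigma\beta_2\varphi\gamma_2\Psi_2$, with $\omega \in \mathrm{FIRST}_k(\varphi\gamma_2\Psi_2)$. If we define $u_1$ and $u_2$ as
before, then $S \Rightarrow_L^* u_2\alpha_2\Psi_1$ with $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and $S \Rightarrow_L^* u_2\beta_2\varphi\gamma_2\Psi_2$ with
$\omega \in \mathrm{FIRST}_k(\varphi\gamma_2\Psi_2)$, which was the required result. The final case, where
$A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$ are both non-essential items of q', follows from
an identical analysis. Q.E.D.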

We note that we could have used essentially the same proof to prove
the following more general result.

Lemma 5.5 If $A \to \alpha_1 . \alpha_2 (\tau)$ and $B \to \beta_1 . \beta_2 (\omega)$ are any two items of the same
state of the LR(k) machine for G, then for some $u \in V_T^*$, $S \Rightarrow_L^* u\alpha_2\Psi_1$ and
$S \Rightarrow_L^* u\beta_2\Psi_2$, where $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and $\omega \in \mathrm{FIRST}_k(\Psi_2)$.

The proof of this lemma follows the same inductive structure as that of
the previous one, and the details are only slightly different. We chose to
prove explicitly the more restricted result for two reasons: the proof
techniques employed there will be used again; and, primarily, because it has
an important analogue, Lemma 5.15, which will be used later, and which does
not have a more general form analogous to Lemma 5.5.

Definition 5.6 The core of an item $A \to \alpha_1 . \alpha_2 (\tau)$ is $A \to \alpha_1 . \alpha_2$.
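
A small Python sketch may make the definition concrete. The `Item`
representation and both function names are hypothetical. The test
`is_essential` additionally assumes, as the proofs in this chapter use the term,
that an item is essential exactly when some symbol precedes the dot; the formal
definition is given earlier in the thesis.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """An LR(k) item A -> before_dot . after_dot (lookahead)."""
    lhs: str
    before_dot: Tuple[str, ...]
    after_dot: Tuple[str, ...]
    lookahead: Tuple[str, ...]


def core(item: Item):
    """The core of an item: the dotted rule with the lookahead dropped."""
    return (item.lhs, item.before_dot, item.after_dot)


def is_essential(item: Item) -> bool:
    """Assumption of this sketch: essential means the dot is not leftmost."""
    return len(item.before_dot) > 0
```

Two items that differ only in their lookahead therefore have the same core,
while moving the dot changes the core.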

Theorem 5.7 Let G be an LL(k) grammar, q any state of the LR(k) machine
for G. If $I_1$ and $I_2$ are essential items of q with different cores, then
$\mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2) = \emptyset$.

Proof Every state of an LR(k) machine is a p-successor of the initial state
for some p. We proceed by induction on the length of p. If |p| = 0,
then q is the initial state; since there are no essential items in the
initial state, the statement is vacuously true. (If this is not satisfying,
we can explicitly apply the techniques described below to start the
induction for |p| = 1.)

Now suppose the theorem is true if |p| = n; we shall show it true
for |p| = n+1. Now if |p| = n+1, we can write p as $\alpha\sigma$, where $|\alpha| = n$
and $\sigma \in V_N \cup V_T$. Then q is the $\sigma$-successor of q', which is an $\alpha$-successor of
the initial state. Thus every essential item of q has $\sigma$ just before the
dot. Let $A \to \alpha_1\sigma . \alpha_2 (\tau)$ and $B \to \beta_1\sigma . \beta_2 (\omega)$ be any two essential items $I_1$
and $I_2$ of q; then $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ and $B \to \beta_1 . \sigma\beta_2 (\omega)$ are items of q'. We assume
that $x \in \mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2)$, and will prove a contradiction.

There are a variety of possibilities for the items $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ and
$B \to \beta_1 . \sigma\beta_2 (\omega)$ of q': they may both be essential items of q'; exactly one
may be an essential item; or neither may be. In any event, since
$x \in \mathrm{FIRST}_k(\beta_2\omega) \cap \mathrm{FIRST}_k(\alpha_2\tau)$ there is certainly some y such that
$y \in \mathrm{FIRST}_k(\sigma\beta_2\omega) \cap \mathrm{FIRST}_k(\sigma\alpha_2\tau)$.

If they are both essential items, then they have different cores, since
$A \to \alpha_1\sigma . \alpha_2 (\tau)$ and $B \to \beta_1\sigma . \beta_2 (\omega)$ have different cores. But then y will be in
$\mathrm{FIRST}_k$ of two essential items of q' with different cores, which contradicts
the induction hypothesis, since q' is an $\alpha$-successor of the initial state
where $|\alpha| = n$.

Next assume $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ is an essential item of q', while $B \to . \sigma\beta_2 (\omega)$
is not. Then $B \to . \sigma\beta_2 (\omega)$ is not the descendant of any item with the same
core as $A \to \alpha_1 . \sigma\alpha_2 (\tau)$, since that would make $\sigma$ left recursive, which is
impossible in an LL(k) grammar. Thus $B \to . \sigma\beta_2 (\omega)$ is a descendant of
$C \to \gamma_1 . \gamma_2 (\rho)$, an essential item of q' with a different core from that of
$A \to \alpha_1 . \sigma\alpha_2 (\tau)$. Therefore $y \in \mathrm{FIRST}_k(\gamma_2\rho)$, which means that y is in $\mathrm{FIRST}_k$
of both $C \to \gamma_1 . \gamma_2 (\rho)$ and $A \to \alpha_1 . \sigma\alpha_2 (\tau)$, which violates the induction hypothesis.

Finally, suppose $A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$ are both non-essential
items of q'. We shall refer to these items as $I_3$ and $I_4$. If $I_3$ and $I_4$
are descended from essential items of q' with different cores, then y is
in $\mathrm{FIRST}_k$ of each of these items, which contradicts the induction hypothesis.
Thus $I_3$ and $I_4$ are descended from items $C \to \gamma_1 . \gamma_2 (\rho_1)$ and $C \to \gamma_1 . \gamma_2 (\rho_2)$.
We observe that neither of $I_3$ and $I_4$ can be a descendant of the other, for
that would make $\sigma$ left recursive. Thus there are chains of items of q',
$J_0, J_1, \ldots, J_n$ and $K_0, K_1, \ldots, K_m$, such that $J_0 = C \to \gamma_1 . \gamma_2 (\rho_1)$,
$K_0 = C \to \gamma_1 . \gamma_2 (\rho_2)$, $J_n = I_3$, and $K_m = I_4$. Let j be the smallest number such that
the core of $J_j$ is not the same as the core of $K_j$; since $I_3$ and $I_4$ have different
cores, and since neither chain is a prefix of the other, there must be such a
j > 0. Then $J_j$ is $E \to . \Gamma_1 (\pi_1)$ and $K_j$ is $E \to . \Gamma_2 (\pi_2)$; the left-hand sides
are the same since $J_{j-1}$ and $K_{j-1}$ had identical cores. Then $E \to \Gamma_1$ and
$E \to \Gamma_2$ are different rules of G. Since $I_3$ is a descendant of $E \to . \Gamma_1 (\pi_1)$,
and since $y \in \mathrm{FIRST}_k(I_3)$, it means that $\Gamma_1 \Rightarrow^* z_1$ such that $y = z_1\pi_1/k$; and
similarly for $I_4$. Thus for some u, we have
$S \Rightarrow_L^* uE\pi_1v_1 \Rightarrow_L u\Gamma_1\pi_1v_1 \Rightarrow_L^* uz_1\pi_1v_1$
and $S \Rightarrow_L^* uE\pi_2v_2 \Rightarrow_L u\Gamma_2\pi_2v_2 \Rightarrow_L^* uz_2\pi_2v_2$ for some $v_1$ and $v_2$, where
$z_1\pi_1/k = z_2\pi_2/k$;
and this violates the definition of LL(k) grammars.

This  is  the  final  case  and  completes  the  induction.  Q.E.D. 

In proving the results of this section, we shall use a slightly more general
definition of a legal state-splitting and of an MSP(k) machine than we have been
using. The change is so minor, however, that all our important results are
valid for the new definition and require only the most trivial alterations in
their proofs. The change consists of changing $L_k(q, A_i)$ to $L_k(B, A_i)$ throughout
the definition of a state splitting; and specifying in the definition of
an MSP(k) machine, that if q is a base state and $q_i$ is an image of q under
$g_1$, then $\{x \mid g_1(q, x) = q_i\} = L_k(q, g_2(q_i))$. The meaning of this revision
is easily explained. Originally we required that there be some $.A_i$ item in
each $L_k(q', A_i)$ chain, and that the set of strings, which were to occasion
the prediction of an $A_i$ from the base of a splitting of q', was to be equal
to the set of all lookaheads that q' might have seen and that could have
come from $A_i$ (i.e., $L_k(q', A_i)$). But now we relax this requirement somewhat.

We now say that a state-splitting is legal if the possible lookaheads from
those $A_i$'s which are being predicted (i.e., those $A_i$'s which are the post-dot
component of some item in the base state) always ensure that a $.A_i$ will be
found; that is, we only need to break $L_k(B, A_i)$ chains in order to define
the splitting. It is not necessary for a lookahead that comes from any $A_i$
to indicate the presence of $A_i$ in the state, but only for those lookaheads
that come from an $A_i$ in the base state; it is this set of lookaheads which
will cause an MSP(k) machine containing the splitting to make the predictive
transition. Conceptually, this is an extension of a similar concept that
occurred in our earlier definition, in the case of a left recursive $A_i$.

Making the stipulated changes in the definitions is completely inessential
to the main course of our development. All major results need only have
the same changes made in their statements or proofs in order to maintain
their validity. The only results that cannot be so easily updated are 3.31
through 3.34. These results were not very important and were only provided
to give some feeling as to what constituted a legal state-splitting; slightly
different versions of them can be obtained for the revised definition of a
splitting. The operation of the MSP(k) machine is unaltered, as is the
grammatical derivation procedure from cycle-free machines; and the transformed
grammar has all the properties we proved in Chapter 4. From now on
we shall be using these new definitions of state splitting and MSP(k) machine;
and a k-transformable grammar means one for which such a cycle-free
MSP(k) machine exists. It is true that the class of k-transformable grammars
under the new definition is larger than the class using the old definition,
for the changes we have made do enlarge the class of legal state-splittings.

Theorem 5.8 If q is a non-final state of the LR(k) machine for an LL(k)
grammar G, then there is a splitting (B, Q) of q such that B consists of
precisely the essential items of q.

Proof We directly define the splitting (B, Q). If A is a non-terminal post-dot
component of some essential item of q, then there is an initial state
associated with A, consisting of all descendants of all .A-essential items
of q. An initial state $P_i$ is created for each such nonterminal $A_i$, and B is
defined as precisely the essential items of q.


We must show that this proposed splitting is indeed legal. First
suppose that $L_k(B, A_i) \cap L_k(B, A_j) \neq \emptyset$, for some i and j. Then there is some
essential $.A_i$ item $I_1$, and some essential $.A_j$ item $I_2$, such that
$\mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2) \neq \emptyset$. But since $A_i \neq A_j$, $I_1$ and $I_2$ must have different cores,
so by the preceding result, $\mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2) = \emptyset$.

Next, since G is LL(k), there can be no $.A_i$ items in $P_i$, for that would
make $A_i$ left-recursive. Hence $\mathrm{FOLLOW}_k(A_i, B) \cap \mathrm{FOLLOW}_k(A_i, P_i) = \emptyset$.

Finally, consider any $L_k(B, A_i)$ chain c. Then this chain must begin with
a $.A_i$-essential item, for otherwise, there would be two essential items with
different cores but with intersecting $\mathrm{FIRST}_k$'s. We define $H_1(c)$ to be the
first item in c, the essential item, and $H_2(c)$ to be the rest of c. We have
to show that these definitions of $H_1$ and $H_2$ satisfy the defining equations
for B and $P_i$. First of all, $H_1(c) \in B$ for any c, by definition of B as all
essential items of q. Then if I is such that $\mathrm{FIRST}_k(I) \not\subseteq \bigcup_i L_k(B, A_i)$, I must
be a terminal essential item, in which case it will be in B. On the other
hand, an item I is in B if and only if it is an essential item; if it is not
a terminal essential item, then it is in $H_1(c)$ for some c; while if it is
a terminal essential item, then $\mathrm{FIRST}_k(I)$ does not intersect $\mathrm{FIRST}_k(I')$, for
any non-terminal essential item I', by the previous lemma, and thus is not
contained in $\bigcup_i L_k(B, A_i)$. Thus the defining equation for B is satisfied.

If I is any item in $P_i$, then I must be a descendant of some essential
$.A_i$-item, and hence will be in $H_2(c)$ for some c. On the other hand, if I
is in $H_2(c)$ for some $L_k(B, A_i)$-chain c, then c must begin with an essential
$.A_i$-item by the previous lemma, and so I will be a descendant of some essential
$.A_i$-item. Therefore the defining equation for $P_i$ is satisfied, and we are
done. Q.E.D.

Theorem 5.9 Let G be an LL(k) grammar, and q be any state of the LR(k) machine
for G. If q' is any state whose essential items are a subset of the essential
items of q, then there is a splitting of q' such that the base state is
precisely the essential items of q'.


Proof We define the splitting in this case the same way as we did in the
preceding case. If $I_1$ and $I_2$ are essential items of q' with different cores,
then they are also essential items of q with different cores, so
$\mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2) = \emptyset$. Furthermore, since no essential $.A_i$-item in q has any
$.A_i$-items as descendants in q, the same holds true in q'. These are the only
two facts needed to make the preceding proof go through for q' as well.

Q.E.D.


Theorem 5.10 Let $M_0$ be the LR(k) machine for the LL(k) grammar G. Then there
exists a cycle-free MSP(k) machine M for G.


Proof We shall give a sure-fire, though somewhat mindless, algorithm for
constructing M from $M_0$ by a sequence of state-splittings. In later sections
we shall see how this procedure could be refined.

1) Set n = 1.

2) Find an intermediate state q such that q is accessible
from the starting state by $\alpha$, where $|\alpha| = n$; if there
is no such q, go to step 4.

3) Replace q by a splitting (B,Q) of q such that B is
precisely the essential items of q; go to step 2.

4) If there are any intermediate states still unsplit in the
machine, set n = n+1 and go to step 2; otherwise, stop.
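
The control flow of steps 1) to 4) can be sketched in Python as follows. This is
only an outline under stated assumptions: the three callbacks are hypothetical
and stand for "list the unsplit intermediate states accessible by a path of
length n", "are any intermediate states left unsplit", and "replace q by the
splitting whose base is the essential items of q".

```python
def split_all_intermediate_states(
    machine,
    intermediate_states_at,     # (machine, n) -> list of unsplit states at depth n
    has_unsplit_states,         # machine -> bool
    split_on_essential_items,   # (machine, q) -> new machine
):
    """Follow steps 1-4 of the procedure; all callbacks are assumptions."""
    n = 1                                                   # step 1
    while True:
        states = intermediate_states_at(machine, n)         # step 2
        if states:
            q = states[0]
            machine = split_on_essential_items(machine, q)  # step 3
            continue                                        # back to step 2
        if not has_unsplit_states(machine):                 # step 4
            return machine
        n += 1
```

The sketch terminates only if every unsplit state is accessible by a path of
some bounded length, which is the property established in the argument that
follows.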

This procedure creates a sequence of MSP(k) machines for G, starting
with $M_0$, such that each is obtained from the preceding one by the performance
of a state-splitting. We claim that this procedure is well-defined, that it
terminates, and results in a cycle-free machine.

First of all, we recall that any state of an MSP(k) machine for G is
contained in some state of the LR(k) machine for G, so that any such intermediate
MSP(k) state can indeed be split in such a way that the base of the
splitting is just the essential items of the state; this is by the previous
result. So at each stage the procedure is well-defined.

Now to show that it terminates. First of all, since there are only
finitely many MSP(k) machines for G, there is some value N such that every
state of every MSP(k) machine for G is accessible from the starting state
by a path of length less than N. Furthermore, if at some point in the
procedure, the test indicates that there are no more unsplit states
accessible by a path of length m, then there are never any such unsplit states
later in the procedure. Thus if at some stage, there are unsplit states,
they are accessible by paths longer than the current value of n and shorter
than N, and so will be considered later. Thus this procedure terminates
at the latest when n reaches the value of N; and when it does terminate,
the resulting MSP(k) machine consists only of initial states and of base
states with only essential items. We claim that this machine is cycle-free.
If there were a cycle in this machine, it would have to contain
only base states, since an initial state is not a successor of any other
state.

Now if there were some cycle involving base states, let q be any
base state in such a cycle. Recall that q and all other states in the
cycle contain only essential items. If the cycle in which q is involved
is of length m, then it must be the case that the essential
items of q can be obtained from themselves by moving the dot in each m places
to the right. But this is absurd, so there can be no such cycle, and we are
done. Q.E.D.

Corollary 5.11 If G is LL(k), then G is k-transformable.

This result has interesting consequences on a number of different levels.
First of all, it enhances the respectability of our definitions. We have gone
to great pains to describe a class of grammars that can be transformed into
LL(k) form, and it is reassuring to know that this class at least includes all
those grammars which are already LL(k). But there are more substantive
issues as well.

Corollary 5.12 The class of languages recognized by the class of cycle-free
MSP(k) machines is precisely the LL(k) languages.

In other words, for every LL(k) language $\mathcal{L}$, there is a grammar G such
that $L(G) = \mathcal{L}$ and such that there is a cycle-free MSP(k) machine for G.

Next we wish to demonstrate an even further extent of the k-transformable
grammars and the attractiveness of our transformation. There is one
large decidable class of grammars which are not LL(k) and which can be
transformed into LL(k) form by the application of a precise algorithm: these
are the LC(k) grammars of Rosenkrantz and Lewis. Intuitively, these are
grammars that can be parsed in a hybrid of bottom-up and top-down
parsing. The basic idea is that the identity of a rule used in a derivation
can be established from the left corner of the rule (the first symbol on its
right hand side) and from the lookahead following the left corner. The
details are complex, and a complete discussion is given in [19]. It is our
goal to show that the k-transformable grammars strictly include the class of
LC(k) grammars.

We review the definition of LC(k) grammars.

Definition 5.13 The left corner of the production $A \to \gamma$ is the first symbol
of $\gamma$.

Definition 5.14 If A is a nonterminal, we say that $S \Rightarrow_{LC}^* uA\Psi$, where $u \in V_T^*$
and $\Psi \in (V_N \cup V_T)^*$, if $S \Rightarrow_L^* uA\Psi$ and if A is not the left corner of the last
production used in the derivation.

Lemma 5.15 If $A \to \alpha_1 . \alpha_2 (\tau)$ and $B \to \beta_1 . \beta_2 (\omega)$ are essential items of the same
state of the LR(k) machine for G, then for some $u \in V_T^*$, $S \Rightarrow_{LC}^* u\alpha_2\Psi_1$ and
$S \Rightarrow_{LC}^* u\beta_2\Psi_2$ such that $\tau \in \mathrm{FIRST}_k(\Psi_1)$ and $\omega \in \mathrm{FIRST}_k(\Psi_2)$.

Proof The proof of Lemma 5.4 can be interpreted as a proof of this lemma as
well. Q.E.D.


We state without proof the following elementary result.

Lemma 5.16 If $B \to . \beta (\omega)$ is descended from $A \to \alpha_1 . C\alpha_2 (\rho)$, then $C \Rightarrow_R^* B\eta$ for
some $\eta \in V_T^*$ such that $\omega \in \mathrm{FIRST}_k(\eta\alpha_2\rho)$.


Definition 5.17 G is LC(k), k > 0, if the following three conditions hold:

1) given $t \in V_T^*$, $x \in V_T^k$, $A \in V_N$, there is at most one
production of the form $B \to \varphi$, where $\varphi$ begins with a
terminal symbol or is $\epsilon$, such that $S \Rightarrow_{LC}^* tA\Psi$ and
$A \Rightarrow_R^* B\rho$ for some $\Psi \in (V_N \cup V_T)^*$ and $\rho \in V_T^*$, where
$x \in \mathrm{FIRST}_k(\varphi\rho\Psi)$.

2) given $t \in V_T^*$, $x \in V_T^k$, $A \in V_N$, and $C \in V_N$, there is
at most one production of the form $B \to C\varphi$ such that
$S \Rightarrow_{LC}^* uA\Psi$ and $A \Rightarrow_R^* B\rho$ for some $u, \rho \in V_T^*$ and $\Psi \in (V_N \cup V_T)^*$,
such that $uC \Rightarrow^* t$ and $x \in \mathrm{FIRST}_k(\varphi\rho\Psi)$.

3) if in the case where C = A, there is a production $B \to A\varphi$
satisfying 2), then there do not exist u' and $\Psi'$ such that
$S \Rightarrow_{LC}^* u'A\Psi'$, and such that $u'A \Rightarrow^* t$ and $x \in \mathrm{FIRST}_k(\Psi')$.


We have slightly altered the notation of this definition from how it
appears in [19], in order to make it easier for us to handle, but we have not
made any significant changes.


Theorem 5.18 Let G be any LC(k) grammar, q any state of the LR(k) machine
for G. If $I_1$ and $I_2$ are essential items of q with different cores, then
$\mathrm{FIRST}_k(I_1) \cap \mathrm{FIRST}_k(I_2) = \emptyset$.

Proof The structure of the proof is identical to the proof of Theorem 5.7,
and most of the details are the same as well. We shall not recopy that proof,
but shall only discuss those parts of the other proof that utilized the fact
that G was LL(k) in that case. There are two such cases: where it was argued
that $B \to . \sigma\beta_2 (\omega)$ could not be descended from $A \to \alpha_1 . \sigma\alpha_2 (\rho)$, and where it was
argued that $A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$ could not be descended from essential
items of q' with the same core. We shall reconsider these two cases. Recall
that the assumption was that $x \in \mathrm{FIRST}_k(\alpha_2\tau) \cap \mathrm{FIRST}_k(\beta_2\omega)$, and that
$y \in \mathrm{FIRST}_k(\sigma\beta_2\omega) \cap \mathrm{FIRST}_k(\sigma\alpha_2\tau)$; this was to lead to a contradiction.




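Suppose that $A \to \alpha_1 . \sigma\alpha_2 (\tau)$ is an essential item and that $B \to . \sigma\beta_2 (\omega)$
is not. If $B \to . \sigma\beta_2 (\omega)$ is descended from an item with a different core
from that of $A \to \alpha_1 . \sigma\alpha_2 (\tau)$, we immediately reach a contradiction, as we
did in the proof of Theorem 5.7. Assume then that $B \to . \sigma\beta_2 (\omega)$ is descended
from an item of the form $A \to \alpha_1 . \sigma\alpha_2 (\rho)$ in q'. Then $\sigma$ is a nonterminal;
call it C.

Since $A \to \alpha_1 . C\alpha_2 (\rho)$ is an essential item of q', we have $S \Rightarrow_{LC}^* uC\alpha_2\Psi$,
with $\rho \in \mathrm{FIRST}_k(\Psi)$; since $B \to . C\beta_2 (\omega)$ is descended from $A \to \alpha_1 . C\alpha_2 (\rho)$, we
have $C \Rightarrow_R^* B\eta$, with $\omega \in \mathrm{FIRST}_k(\eta\alpha_2\rho)$. Then since $x \in \mathrm{FIRST}_k(\beta_2\omega)$, we have
$x \in \mathrm{FIRST}_k(\beta_2\eta\alpha_2\Psi)$. Thus for t, x, and C, there is a rule $B \to C\beta_2$ such that
$S \Rightarrow_{LC}^* uC\gamma$, $C \Rightarrow_R^* B\eta$ and $uC \Rightarrow^* t$, where $x \in \mathrm{FIRST}_k(\beta_2\eta\gamma)$ (taking
$\gamma = \alpha_2\Psi$). On the other hand,
since $A \to \alpha_1 . C\alpha_2 (\tau)$ is an essential item of q', we have $S \Rightarrow_{LC}^* uC\alpha_2\Psi'$, with
$\tau \in \mathrm{FIRST}_k(\Psi')$; since $x \in \mathrm{FIRST}_k(\alpha_2\tau)$, this means that $S \Rightarrow_{LC}^* uC\gamma'$, with
$uC \Rightarrow^* t$ and $x \in \mathrm{FIRST}_k(\gamma')$ (taking $\gamma' = \alpha_2\Psi'$). But these two statements
contradict the third clause of the definition of LC(k) grammars.

To try to make the foregoing a little clearer, consider the two trees
of Figure 5.1. They illustrate the two situations just delineated, and do
indeed violate the definition of LC(k)-ness.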

Figure 5.1

Now, we must consider the case where $A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$ are
both non-essential items of q'. If they are descended from items with
different cores, then y is in $\mathrm{FIRST}_k$ of each of those items and we are done.
Assume then that they are descended from essential items with the same core,
namely $C \to \gamma_1 . D\gamma_2 (\rho_1)$ and $C \to \gamma_1 . D\gamma_2 (\rho_2)$ respectively. Since these two items
are in q', we have $S \Rightarrow_{LC}^* uD\gamma_2\Psi_1$ and $S \Rightarrow_{LC}^* uD\gamma_2\Psi_2$, with $\rho_i \in \mathrm{FIRST}_k(\Psi_i)$. If $\sigma$ is
a non-terminal, then for a given t, x, D, and $\sigma$, we have two productions
$A \to \sigma\alpha_2$ and $B \to \sigma\beta_2$, such that $S \Rightarrow_{LC}^* uD\Psi_3$, $D \Rightarrow_R^* A\eta_1$ where $u\sigma \Rightarrow^* t$ and
$x \in \mathrm{FIRST}_k(\alpha_2\eta_1\Psi_3)$, and such that $S \Rightarrow_{LC}^* uD\Psi_4$, $D \Rightarrow_R^* B\eta_2$ where $u\sigma \Rightarrow^* t$ and
$x \in \mathrm{FIRST}_k(\beta_2\eta_2\Psi_4)$, which violates the second clause of the LC(k) definition.
Once again we include an illustrative figure, Figure 5.2.

Figure 5.2


Finally, we must consider the case where $A \to . \sigma\alpha_2 (\tau)$ and $B \to . \sigma\beta_2 (\omega)$
are descended from essential items of q' with the same core, and $\sigma$ is a
terminal symbol. Then as in the preceding case, $S \Rightarrow_{LC}^* uD\Psi_3$, $D \Rightarrow_R^* A\eta_1$ with
$\tau \in \mathrm{FIRST}_k(\eta_1\Psi_3)$, and $S \Rightarrow_{LC}^* uD\Psi_4$, $D \Rightarrow_R^* B\eta_2$ with $\omega \in \mathrm{FIRST}_k(\eta_2\Psi_4)$. But since
$y \in \mathrm{FIRST}_k(\sigma\beta_2\omega) \cap \mathrm{FIRST}_k(\sigma\alpha_2\tau)$, this means that given u, y, and D, there
are two productions beginning with terminals, namely $A \to \sigma\alpha_2$ and $B \to \sigma\beta_2$,
such that $S \Rightarrow_{LC}^* uD\Psi_3$, $D \Rightarrow_R^* A\eta_1$, with $y \in \mathrm{FIRST}_k(\sigma\alpha_2\eta_1\Psi_3)$, and such that
$S \Rightarrow_{LC}^* uD\Psi_4$, $D \Rightarrow_R^* B\eta_2$, with $y \in \mathrm{FIRST}_k(\sigma\beta_2\eta_2\Psi_4)$. But this contradicts the
first clause of the LC(k) definition. Figure 5.3 is included for illustrative
purposes.


Theorem 5.19 If q is a non-final state of the LR(k) machine for an LC(k)
grammar G, then there is a splitting (B,Q) of q such that B consists of
precisely the essential items of q.

Proof We define (B,Q) as we did in the earlier theorem. Namely, if there is
an essential .A-item in q, then there is a predictive state for A, consisting
of the descendants of all essential .A-items in q.

The proof of the earlier theorem works almost exactly here, because of
our preceding result. The only thing we have to demonstrate differently in
this case is that $\mathrm{FOLLOW}_k(A_i, B) \cap \mathrm{FOLLOW}_k(A_i, P_i) = \emptyset$, because here
$\mathrm{FOLLOW}_k(A_i, P_i)$ might not be empty. But this fact follows from the LC(k)-ness of G,
as follows.

If x ∈ FOLLOW_k(A_i, B) ∩ FOLLOW_k(A_i, P_i), then there is some essential
.A_i-item C → α_1.A_iα_2(τ) with x ∈ FIRST_k(α_2τ), and some non-essential item
D → .A_iβ_2(ω) with x ∈ FIRST_k(β_2ω), such that the latter is a descendant of
some essential .A_i-item E → γ_1.A_iγ_2(ρ).

Then we have S ⇒* uEγ, with ρ ∈ FIRST_k(γ), and S ⇒* uCγ' with
τ ∈ FIRST_k(γ'). Further, we know that A_i ⇒* Dη, with ω ∈ FIRST_k(ηγ_2γ);
hence x ∈ FIRST_k(β_2ηγ_2γ). Thus for a given t, x, and A_i, we have a rule
D → A_iβ_2 with S ⇒* uEγ and A_i ⇒* Dη such that uγ_1 ⇒* t and
x ∈ FIRST_k(β_2ηγ_2γ).

But at the same time, we have S ⇒* uCγ' such that uα_1 ⇒* t and
x ∈ FIRST_k(α_2γ'). This contradicts the third clause of the LC(k) definition,
and so we are done. Q.E.D.

We illustrate this argument in Figure 5.4.

Figure  5.4 

Theorem 5.20 If G is LC(k), then G is k-transformable.

Proof The algorithm of Theorem 5.10, for constructing a cycle-free MSP(k)
machine for an LL(k) grammar G, only used the property that any state of
the LR(k) machine for G could be split in such a way that the base of the
splitting contained only essential items. Since we have just shown that
this property applies as well to LC(k) grammars, that algorithm will work
for LC(k) grammars as well. Q.E.D.

This means that our transformation is at least as effective as the one
in [19]. Actually, since every LL(k) grammar is LC(k), Corollary 5.11 follows
from Theorem 5.20, but we felt that the former proof was more understandable
than the latter, so both were included.

Consider the LR(0) grammar S → bAc, A → ABx, A → ABy, A → a, B → Bd,
B → d. This grammar can be shown not to be LC(k) for any k; intuitively,
the reason is that even after seeing past the corners of A → ABx and A → ABy,
no amount of lookahead can distinguish these two rules. However, this grammar
is 0-transformable; its LR(0) machine is given in Figure 5.5, and is
obviously cycle-free.


Figure 5.5


The grammar derived from this machine is given in Figure 5.6 as another
illustrative example.


(S,ε)    → b(S,b)
(S,b)    → a(S,ba)
(S,ba)   → (S,bA)
(S,bA)   → c(S,bAc)
(S,bA)   → d(S,bAd)
(S,bAc)  → (S,S)
(S,bAd)  → (S,bAB)

(S,S)    → ε
(S,bAB)  → d(S,bABd)
(S,bAB)  → x(S,bABx)
(S,bAB)  → y(S,bABy)
(S,bABd) → (S,bAB)
(S,bABx) → (S,bA)
(S,bABy) → (S,bA)


Figure 5.6

This grammar is LL(1), as advertised. In order to reduce its size,
we construct its nonterminal graph.


Figure  5.7 

For absolute minimality, we eliminate all nonterminals other than (S,ε)
and (S,bAB), resulting in an LL(3) grammar with two nonterminals and seven
productions. However, if we additionally include (S,bA), the result is an
LL(1) grammar with three nonterminals and six productions.
((S,bA) was chosen because of its proximity to a branching in the graph.)
Renaming the nonterminals as S, A, and B, this grammar is S → baA, A → c,
A → dB, B → dB, B → xA, B → yA.
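
To see concretely that the six-production grammar can be parsed top-down with one symbol of lookahead, the following is a minimal Python sketch of a table-driven LL(1) recognizer for it. The parse table is written out by hand from the productions above; the function name `accepts` and the end marker `$` are illustrative assumptions, not notation from the thesis.

```python
# LL(1) table for the three-nonterminal grammar quoted above:
#   S -> b a A      A -> c | d B      B -> d B | x A | y A
# Keys are (nonterminal, lookahead terminal); values are right-hand sides.
TABLE = {
    ("S", "b"): ["b", "a", "A"],
    ("A", "c"): ["c"],
    ("A", "d"): ["d", "B"],
    ("B", "d"): ["d", "B"],
    ("B", "x"): ["x", "A"],
    ("B", "y"): ["y", "A"],
}
NONTERMINALS = {"S", "A", "B"}
END = "$"  # hypothetical end-of-input marker, not part of the grammar


def accepts(text: str) -> bool:
    """Deterministic top-down recognition using one symbol of lookahead."""
    tokens = list(text) + [END]
    stack = [END, "S"]  # the top of the stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False  # no production is predicted by this lookahead
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1  # match a terminal (or the end marker)
        else:
            return False
    return pos == len(tokens)
```

Strings such as bac and badxdyc, which the original grammar S → bAc, A → ABx, A → ABy, A → a, B → Bd, B → d also derives, are accepted without backtracking, since every table entry is selected by a single lookahead symbol.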

In any event, since we have seen a 0-transformable grammar which is not
LC(k) for any k, and since it is clear that a 0-transformable grammar is
k-transformable for any k, we have the following:

Theorem 5.21 The class of k-transformable grammars strictly includes the
class of LC(k) grammars.

Looking back at our proofs that LC(k) and LL(k) grammars are
k-transformable, we see that the critical point in each case was that any
state of the LR(k) machine for an LL(k) or LC(k) grammar can be split in such
a way that the base of the splitting consists precisely of the essential
items of the state being split. We might define a class of grammars which
satisfy this property, namely that any state of the LR(k) machine for a
grammar in the class can be split in such a way. Such a class would fall
between the LC(k) and k-transformable grammars; in fact, this class has been
proposed by
Rosenkrantz,  Lewis,  and  Stearns  [15]  under  the  jocular  name  of  "fingernail 
grammars".  Such  grammars  have  the  property  that  if  a  prefix  of  the  right- 
hand  side  of  some  rule  has  been  recognized,  then  k  symbols  of  lookahead  will 
identify  what  the  next  symbol  of  the  rule  must  be;  or  when  the  rule  is  over, 
what  its  left-hand  side  is.  They  can  be  thought  of  as  a  generalization  of 
LC(k)  grammars,  and  the  k-transformable  grammars  are  a  generalization  of  this 
class.  This  was  the  original  genesis  of  the  ideas  in  this  thesis,  which  led 
in a roundabout way to the work reported herein.

5.2 Strategies for State-Splitting and Cycle Breaking

As we have stated a number of times, a possible application of the
transformation we have developed would be in an early stage of an automatic
compiler writing system, where a human-defined programming language grammar
would be transformed into a version more suitable for actual use in a
compiler. But the point of departure for the application of our transformation
must always be some particular cycle-free MSP(k) machine for the original
grammar. While we know that every cycle-free MSP(k) machine for a grammar can
be obtained by starting with the LR(k) machine and performing some sequence
of state splittings, such assurances are a little too vague to be of much
practical value. We now turn our attention to the problem of how to split
states in an intelligent manner in order to obtain a cycle-free machine, and
specifically, how to get a cycle-free machine whose derived grammar exhibits a
variety of desirable properties. The development of this section will be
somewhat less formal than the foregoing; we shall highlight what seem to us to
be the key issues of the problem. Our goal is to develop a feeling for good
approaches to this area, which could be incorporated in a heuristic (rather
than fully algorithmic) way into the hypothetical compiler-compiler we have
mentioned.

As we saw in the previous section, every intermediate state of an MSP(k)
machine for an LL(k) grammar G can be split in such a way that the base of
the splitting consists precisely of the essential items of the state being
split. Then we saw how, by starting with the LR(k) machine for G and by
blindly splitting states in this way, we could be assured of eventually ending
up with a cycle-free MSP(k) machine. This is precisely the sort of magic which
we wish to avoid. We want to develop some deterministic strategies whereby
the cycles of a machine may be deliberately destroyed, rather than accidentally
as they were in the preceding section. So let us start at the beginning, and
ask: how is it possible for the performance of a state-splitting to cause the
elimination of a cycle in an MSP(k) machine, and how can we choose such a
splitting?

Let us consider a simple stylized cycle, such as the one in Figure 5.8.
Two states might occur in this configuration in an MSP(0) machine. How can
we split either of these states to destroy this cycle? Since q_2 is the
σ_1-successor of q_1, it means that the essential items of q_2 are obtained
from some of the .σ_1 items of q_1, by moving the dot one place to the right.


Figure  5.8 

Now suppose we were able to split q_1 into a bipartite state-splitting, such
that none of the .σ_1 items of q_1 were in the base state of the splitting.
Then they would all have to be in the initial state, and there would be no
σ_1-transitions out of the base state. However, the essential items of the
base of this splitting would be the same as the essentials of q_1, so if we
replaced q_1 by this splitting, the base would become the σ_2-successor of
q_2. The picture that would result is given in Figure 5.9, and presto, the
cycle is gone.

Figure 5.9

This process is known as breaking a cycle by performing a state-splitting.
The concept involved is simple enough. If q_2 is the σ_1-successor of q_1 in a
cycle, then the essential items of q_2 are derived from some of the items of
q_1; if q_1 can be split so that these items do not appear in the base state
of the splitting, q_2 will no longer be the successor of the state holding
q_1's place in the machine, and the cycle is broken.

This concept extends immediately to larger cycles. If state q_2 is the
σ-successor of q_1 and q_1 is the α-successor of q_2, then q_1 is its own
σα-successor, and there is a cycle in the machine. If we split q_1 into a
bipartite splitting with .σ items only in the predictive state, then q_2 will
be the σ-successor of the initial state, and the cycle no longer exists. Thus
we have a conscious and deliberate method for eliminating a cycle by
performing a state-splitting. We do not claim that the only way to destroy a
cycle is to split one of its states in such a way, but it is by far the best
and most direct way, and we shall concentrate our attention on it.

But is it necessary to have a bipartite splitting of q_1? And is it
necessary to eliminate all .σ items from the base state of the splitting? The
answer to both these questions is no, but there are some complications
involved. Let us consider the second one first. Obviously, we do not need to
exclude from the base state those items of the form A → α.σ(τ), because they
do not contribute to essential items of q_2. But even among the items of the
form A → α.σβ(τ), some are more important than others, because there may be
differences among the essential items of q_2.

Referring again to the simple cycle of Figure 5.8, we remark that every
essential item of q_2 is obtained from some .σ_1-item of q_1 by moving the dot.
But every item of q_1 is a descendant of some essential item of q_1, and every
essential item of q_1 is obtained by moving the dot on some .σ_2-item of q_2.
But now we have come full circle, because every item of q_2 is a descendant
of some essential item of q_2. In other words, every essential item of q_2
causes some items (its descendants) to be in q_2; some of these items may
cause the appearance of an essential item in q_1; and the process continues.
That is, each essential item of q_1 causes the appearance of some essentials in
q_2, and vice versa. This is the essence of there being a cycle consisting
of these two states: the essential items of q_1 cause the essential items
of q_2 to be in the σ_1-successor of q_1, while these same essential items of
q_2 cause the essentials of q_1 to be those of the σ_2-successor of q_2. Thus
the essential items of q_1 cause their own reappearance in the σ_1σ_2-successor
of q_1, causing there to be a cycle. Let us consider an essential item I_1 of
q_1 which causes the appearance of essential item I_2 in q_2, which in turn
causes the appearance of I_1 in q_1. Then I_1 is a very dangerous item indeed;
it is enough to start a cycle all by itself. That is, any state that has I_1
in it will also have I_1 in its σ_1σ_2-successor; and since there are only
finitely many states containing I_1, this means there is going to be a cycle
somewhere, unless there are some state-splittings along the way. Similar
comments apply to the item I_2 of q_2. Such items we call seed items of the
cycle, because they contain the seed of a cycle in and of themselves; if
such an item is planted in any state, some of the successors of that state
will be involved in a cycle. The seed items of q_2 are the truly critical
essential items of q_2, which at all costs must not appear in the σ_1-successor
of the base state of a splitting of q_1, if we desire to break the cycle by
splitting q_1.

We can put this in other words as follows. If A → ασ_1.β(τ) is a seed
item of q_2, then we call A → α.σ_1β(τ) a vital item of q_1. The goal of
splitting q_1 is to replace q_1 by a splitting such that no vital items of q_1
are in the base of the splitting.

However, there may be essential items of q_2 which do not cause their own
appearance in q_2; rather they just happen to pop up in q_2 because of some of
the other essential items. The disposition and fate of these items is not
so critical; no harm is done if they do appear in the σ_1-successor of the
base of the splitting of q_1.

As an illustration of these concepts, consider the single state of
Figure 5.10, which is a cycle all by itself. There are two essential items
in this state.


    A → AA.x(x)
    A → A.Ax(x)
    A → .AAx(x)
    A → .a(x)


Figure 5.10

By our first ideas, in order to break this cycle it would be necessary to
split this state so that neither A → A.Ax(x) nor A → .AAx(x) would appear in
the base of the splitting, for these are the items from which the essential
items are obtained. But that is manifestly impossible, since A → A.Ax(x) is
an essential item and must be present in the base state of any splitting.
Let us see then how our new concepts would aid us here.

The essential item A → A.Ax(x) is a seed item of this cycle, since
A → .AAx(x) is its descendant, and A → A.Ax(x) can be obtained from this
latter item by moving the dot one place. However, the item A → AA.x(x) is
not a seed item, since it is not obtained from any of its descendants, but
rather from the other essential item (by moving the dot). Thus A → .AAx(x) is
the only vital item, the only item that must not appear in the base state of
the splitting for us to successfully break this cycle. This can be arranged by
a simple bipartite state-splitting of the state; the result of doing this
splitting is shown in Figure 5.11. And indeed there no longer is a cycle on A.

We do note that the base state is not the A-successor of the predictive
state, but its AA-successor, even though the original state was its own
A-successor. This occurs because not all of the essential items were seed
items.

The computation of seed items can sometimes be a little more subtle than
the foregoing indicates. For example, in the state of Figure 5.12, both
essential items are seed items, though it may initially seem as though
neither one is. For example, B → .aAy is a descendant of A → a.Bx; B → a.Ay
can be obtained from B → .aAy by moving the dot; A → .aBx is a descendant of
B → a.Ay; and we finally get back to the essential item we started with by
moving the dot on A → .aBx.


Figure 5.12

We now define these notions formally in order to demonstrate some useful
facts.

Definition 5.22 Item I' is an ε-derivative of item I if I' is a descendant
of I; I' is a σ-derivative of I if I is A → α_1.σα_2(τ) and I' is A → α_1σ.α_2(τ).
I' is an α-derivative of I if there is a sequence of items I = I_0, I_1, ...,
I_n = I', such that I_{i+1} is a σ_i-derivative of I_i, and α = σ_0 ... σ_{n-1}.
The α-derivative of a set of items is the set of α-derivatives of all items
in the set.
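
Definition 5.22 can be made concrete with a short Python sketch. It rests on several simplifying assumptions that are not part of the thesis: items are represented as (left-hand side, right-hand side, dot position) triples, lookahead strings are ignored (so these are LR(0)-style items), and an item is counted among its own descendants. The names `descendants`, `sigma_derivative`, and `derivative` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Item = Tuple[str, Tuple[str, ...], int]       # (lhs, rhs, dot position)
Grammar = Dict[str, List[Tuple[str, ...]]]    # nonterminal -> alternatives


def descendants(items: Iterable[Item], g: Grammar) -> Set[Item]:
    """Epsilon-derivatives: the items themselves plus every item predicted
    by a nonterminal standing immediately after a dot."""
    result = set(items)
    work = list(result)
    while work:
        _lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in g:
            for alt in g[rhs[dot]]:
                new = (rhs[dot], alt, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return result


def sigma_derivative(items: Iterable[Item], sigma: str) -> Set[Item]:
    """Move the dot over the single symbol sigma wherever that is possible."""
    return {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == sigma}


def derivative(items: Iterable[Item], alpha: Iterable[str], g: Grammar) -> Set[Item]:
    """Alpha-derivative of a set of items, for a string alpha of symbols."""
    current = descendants(items, g)
    for sigma in alpha:
        current = descendants(sigma_derivative(current, sigma), g)
    return current
```

For the grammar A → AAx, A → a of Figure 5.10, the A-derivative of the item A → A.Ax is the full four-item state shown in that figure, which is why that state is its own A-successor.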

The following is a restatement of the way successors are computed in
LR(k) machines.

Lemma 5.23 If q' is the non-final α-successor of q in the LR(k) machine M,
then the items of q' are the set of α-derivatives of the items of q.

The situation is not so simple in a more general MSP(k) machine, because
there may be split states, and the α-derivatives of an earlier state
may be scattered. However, we do know the following.

Lemma 5.24 If q' is a non-final α-successor of q in an MSP(k) machine, then
the items of q' are a subset of the α-derivative of the items of q.

In particular, this means that every essential item of q' is an
α-derivative of some essential item of q.

Now suppose we had a cycle in an MSP(k) machine, where q is some state
in the cycle, and q is its own α-successor. Then every essential item of q
is an α-derivative of some essential item of q.

Definition 5.25 Let q be a state of an MSP(k) machine such that q is its
own α-successor. If an essential item of q is its own α^n-derivative, for
some n, then that item is a seed item of q with respect to the cycle on α.

Lemma 5.26 If q is its own α-successor in an MSP(k) machine, then there
exist some seed items of q with respect to the cycle on α, and every essential
item of q is an α^i-derivative of some seed item, for some i.

Proof Pick any essential item I_0 of q. Then I_0 is an α-derivative of some
essential item I_1 of q; similarly, I_1 is an α-derivative of some I_2, and so
on. Thus we get a sequence I_0, I_1, ... such that I_i is an α-derivative
of I_{i+1}, and all are essential items of q. Since there are only finitely
many essential items of q, there must be some repetition in this sequence.
So for some I_i and I_{i+j}, we have I_i = I_{i+j}. Then I_i is its own
α^j-derivative, and so is a seed item. And also I_0, which was arbitrary, is
an α^i-derivative of I_i. Q.E.D.
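
The seed-item test of Definition 5.25 can be sketched in Python as a search for the item in its own repeated α-derivatives. This is only an illustration under stated assumptions: items are (left-hand side, right-hand side, dot position) triples with lookahead ignored, the function names are hypothetical, and the second grammar below (A → aBx, B → aAy) is assumed from the items quoted for Figure 5.12 rather than taken from the figure itself.

```python
def closure(items, g):
    """Items plus all their descendants (dot before a nonterminal predicts it)."""
    result, work = set(items), list(items)
    while work:
        _lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in g:
            for alt in g[rhs[dot]]:
                new = (rhs[dot], alt, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return result


def step(items, alpha, g):
    """Alpha-derivative of a set of items for a string alpha of symbols."""
    current = closure(items, g)
    for sigma in alpha:
        moved = {(l, r, d + 1) for (l, r, d) in current if d < len(r) and r[d] == sigma}
        current = closure(moved, g)
    return current


def is_seed(item, alpha, g):
    """True if item is its own alpha^n-derivative for some n >= 1."""
    current, seen = frozenset([item]), set()
    # The sets of derivatives must eventually repeat or become empty,
    # because there are only finitely many items.
    while current and current not in seen:
        seen.add(current)
        current = frozenset(step(current, alpha, g))
        if item in current:
            return True
    return False
```

On the state of Figure 5.10 this reports A → A.Ax as a seed item for the cycle on A and A → AA.x as a non-seed item, matching the informal discussion earlier in this section.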

Lemma 5.27 If B → β_1.β_2(ω) is an α-derivative of A → α_1.α_2(τ), then
α_2 ⇒* αβ_2γ for some γ.

Proof We proceed by induction on the length of α. If |α| = 0, then
B → β_1.β_2(ω) is a descendant of A → α_1.α_2(τ), and the result is immediate.
Suppose the statement is true for |α| = n; then consider the case where
|α| = n+1. Suppose B → β_1.β_2(ω) is an essential item. Then β_1 = β_1'σ, where
α = φσ. Then B → β_1'.σβ_2(ω) is a φ-derivative of A → α_1.α_2(τ), where
|φ| = n. Therefore α_2 ⇒* φσβ_2γ by induction; since α = φσ, this gives us
α_2 ⇒* αβ_2γ as required. If B → .β_2(ω) is not essential, then it is the
descendant of some essential item which is an α-derivative of A → α_1.α_2(τ),
and the result follows immediately. Q.E.D.

Theorem 5.28 If A → α_1.α_2(τ) is a seed item of q with respect to the cycle
on α, then α_2 ⇒* (α)^i α_2γ for some γ and i.

Proof If A → α_1.α_2(τ) is a seed item, then it is its own α^i-derivative for
some i. The result then follows from the preceding lemma. Q.E.D.
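
As a small worked instance (added here for illustration, and not part of the original text), Theorem 5.28 can be checked against the state of Figure 5.10. There the seed item is A → A.Ax(x), so α_2 = Ax, and the cycle is on α = A. Rewriting the leading A by the rule A → AAx gives, with i = 1 and γ = x:

```latex
\alpha_2 \;=\; A\,x \;\Rightarrow\; (A\,A\,x)\,x \;=\; A \cdot (A\,x) \cdot x \;=\; (\alpha)^{1}\,\alpha_2\,\gamma ,
\qquad \gamma = x .
```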

We  shall  find  this  result  very  useful  in  determining  good  strategies  for 
breaking  a  cycle  by  state-splitting. 

Definition 5.29 Suppose q' is a σ-successor of q and q is an α-successor of
q', in the MSP(k) machine M. If A → α_1σ.α_2(τ) is a seed item of q' with
respect to the cycle on ασ, then A → α_1.σα_2(τ) is a vital item of q with
respect to the cycle.

With these ideas in hand, we proceed to consider the issues of
state-splittings which keep vital items out of the base state.

To return to a question we asked earlier, is it necessary to perform a
bipartite splitting of a state in a cycle in order to break the cycle?
Intuitively, we would guess that the answer should be no, that the only
important issue is to somehow keep the vital items of the state out of the
base of the splitting, with the number of initial states being irrelevant.
This is the case, but this fact is not very important, since in any
conceivable case a bipartite splitting will suffice. Or put another way, under
any realistic circumstances, there can be no worthwhile multi-partite
splitting. Suppose q is its own σα-successor, and suppose there is a splitting
of q with vital items A_1 → .σα_1(τ_1) and A_2 → .σα_2(τ_2) in different
initial states. Then by the laws of legal state-splitting,
FIRST_k(σα_1τ_1) ∩ FIRST_k(σα_2τ_2) = ∅. But by the definition of vital item,
A_1 → σ.α_1(τ_1) and A_2 → σ.α_2(τ_2) will be seed items; therefore,
α_1 ⇒* (ασ)^i α_1γ_1 and α_2 ⇒* (ασ)^j α_2γ_2 for some i, j, γ_1, and γ_2, by
Theorem 5.28. Therefore, for any m at all, α_1 ⇒* (ασ)^m φ_1 and
α_2 ⇒* (ασ)^m φ_2. Thus unless the only string ασ generates is ε, FIRST_k(σα_1)
will intersect with FIRST_k(σα_2). It is safe to say that in any practical
situation the label of a cycle in an MSP(k) machine will not be a sequence of
symbols that generates only ε. So in any interesting situation, there cannot
be any multi-partite splitting of a state such that vital items occur in
different initial states of the splitting; thus there seems to be no practical
use for multi-partite splittings, and we can restrict our attention to
bipartite splittings for the time being.

It is not a difficult matter to compute the seed items of a state with
respect to a given cycle, and hence we can obtain the vital items of any state
in a cycle. But such a computation will frequently be unnecessary. The vital
items of a state in a cycle form the minimum set of items which must go
into the predictive state of a splitting in order to destroy the cycle. If
q is its own σα-successor, then the vital items of q are all .σ-items. It
would be reasonable to first attempt to find a splitting of q such that all
.σ-items are in the predictive state; if no such splitting can be found, then
we could actually compute precisely which of the .σ-items are vital, and try
to find some splitting such that at least these items are in the predictive
state. Very often, if there is a splitting which puts the vital items in the
initial state, then it also places the other .σ-items there as well. This will
be the case whenever σ generates a string longer than k, as is easy to
demonstrate.

How do we go about finding a useful splitting of a state in a cycle?
First of all, we determine the vital items of the cycle in that state (or
whatever set of items we want to keep out of the base state). Then we want
to find a splitting of the state that leaves all these items out of the base
of the splitting. We recall that a splitting is defined by the functions H_1
and H_2 on the simple chains through the state; so our next step is to form
all these chains. We isolate those simple chains that contain a vital item;
we call these vital chains. We must find a splitting which breaks every
vital chain "above" all the vital items in it, and breaks the other simple
chains in any way that makes the splitting legal. As we have seen, for useful
cases we need concern ourselves only with bipartite splitting, so the
only candidates for the predictive nonterminal of the initial state of the
splitting are those that appear on the left-hand side of an item above every
vital item in any vital chain. We could select each possible candidate, try
all its state-splittings, and see if any of them are both legal and put all
vital items into the predictive state. However, we can restrict this
inefficient procedure and give it some guidance in the following way.

For any vital item I, consider FIRST_k(I). We know that, in order for I
to be in the predictive state and not in the base state, any lookahead from
FIRST_k(I) will have to occasion the prediction to be made. So the union of
FIRST_k(I), over all vital items I, will have to be contained in the predictive
language that occasions the prediction to be made. We can define the language
of a chain as being FIRST_k of the terminal item at the end of a chain. We
then perform the following procedure:

1) Compute the language of every vital chain.

2) Designate any other chain whose language intersects the
language of some vital chain as being vital as well.

3) Iterate this process until no new vital chains are found.

Then we have a class of vital chains, some of which contain vital items
and some which do not.
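
The three-step procedure above is a fixed-point computation, and can be sketched in Python as follows. The function name `close_vital_chains` and the dictionary representation are assumptions made for illustration; the chain languages are the ones listed in Figure 5.14 for the example state treated later in this section, with chain 10 as the only chain that contains a vital item.

```python
def close_vital_chains(languages, initial_vital):
    """languages: chain number -> set of lookahead strings (the chain's language).
    initial_vital: numbers of the chains that contain a vital item.
    Returns the final set of vital chains and the predictive language."""
    vital = set(initial_vital)
    changed = True
    while changed:
        changed = False
        # Step 1: the languages of all chains currently known to be vital.
        predictive = set().union(*(languages[c] for c in vital))
        # Step 2: any other chain whose language intersects them becomes vital.
        for chain, lang in languages.items():
            if chain not in vital and lang & predictive:
                vital.add(chain)
                changed = True
        # Step 3: repeat until no new vital chains are found.
    predictive = set().union(*(languages[c] for c in vital))
    return vital, predictive


# Chain languages as given in Figure 5.14.
LANGUAGES = {
    1: {"cd"}, 2: {"bb"}, 3: {"cc"}, 4: {"bd"}, 5: {"cd"}, 6: {"cd"},
    7: {"ab"}, 8: {"ac"}, 9: {"ab", "ac"}, 10: {"aa", "ac"}, 11: {"ac"},
    12: {"bc"},
}

vital, predictive = close_vital_chains(LANGUAGES, {10})
```

Starting from chain 10, the first pass adds chains 8, 9, and 11 through the shared string ac, and the second pass adds chain 7 through ab, which agrees with the set of chains 7-11 derived by hand in the worked example.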

The  reason  for  performing  this  computation  is  simple.  It  is  possible  to 
compute  the  languages  of  the  vital  items  of  a  state  by  taking  the  union  of  the 
languages  of  all  chains  in  which  some  vital  item  appears.  It  is  essential  that 
a  prediction  be  made  whenever  one  of  these  lookaheads  is  espied  upon  entry  to 
the  base  state,  if  the  vital  items  are  to  be  in  the  predictive  state.  However, 


if some string, which might be a lookahead of a vital item, is also in the
language of some other chain, then for whatever nonterminal A is being
predicted, there had better be a .A item in that non-vital chain as well. So
all strings in the language of that other chain will occasion a prediction to
be made, so that chain may as well be considered vital; and the computation
continues. The result is a set of chains such that any string in the language
of any of these chains will cause a prediction to be made. Thus the first step
in computing a splitting is to find some nonterminal A such that there is a
.A item in each of these chains, above any vital items that may be in a chain.
The plan is to find a distinguished set of .A-items; the A's in these items
will be the predicted nonterminal. The idea is that these distinguished
.A-items will serve to define H_1 and H_2 on the vital chains. That is, H_1 of
a chain will be all of the chain from the top through the selected .A-item,
while H_2 will be the rest of the chain.

Locating a nonterminal A which has some .A-items in the appropriate places
in the vital chains is not by itself the end of the story. First of all, if
A is left recursive there may be several .A-items to choose from in some vital
chains. The choice of certain of these items as the place to break the chain
may violate the condition of disjointness of the follow sets of A in the base
and predictive states. Clearly, caution must be exercised in such a case.
Furthermore, even if A is not left recursive, the chosen .A-items may occur in
other, non-vital chains. Then the strings in the languages of these chains
will also have to be in the predictive language; therefore any chain which
has one of these strings in its language will have to contain one of the
distinguished .A-items. The functions H_1 and H_2 will be defined on these
chains in the usual way: generally, it is best to break a chain at the lowest
possible .A-item, so as to minimize the possibilities of a follow set conflict.

As an example of how this procedure is applied, consider the LR(2) state
of Figure 5.13; this state is its own a-successor. Inspection shows us that
the only seed item of this state is the item A → a.B(bb), and so the only
vital item is A → .aB(bb); we want to find a bipartite splitting of this
state so that A → .aB(bb) is not in the base state.


    A → a.B(bb)
    A → a.cd(bb)
    D → a.b(bb)
    D → a.cc(bb)
    D → a.E(bb)
    B → .cdd(bb)
    B → .Db(bb)
    D → .ab(bb)
    D → .acc(bb)
    D → .aE(bb)
    D → .bc(bb)
    D → .Ab(bb)
    A → .aB(bb)
    A → .acd(bb)
    E → .bd(bb)
    E → .cdd(bb)


Figure 5.13

The chains through this state are given in Figure 5.14. Each chain is
organized vertically and is numbered for convenience; the language of each
chain is given beneath it in braces.


    1                   2                   3
    A → a.cd(bb)        D → a.b(bb)         D → a.cc(bb)
    {cd}                {bb}                {cc}

    4                   5                   6
    D → a.E(bb)         D → a.E(bb)         A → a.B(bb)
    E → .bd(bb)         E → .cdd(bb)        B → .cdd(bb)
    {bd}                {cd}                {cd}

    7                   8                   9
    A → a.B(bb)         A → a.B(bb)         A → a.B(bb)
    B → .Db(bb)         B → .Db(bb)         B → .Db(bb)
    D → .ab(bb)         D → .acc(bb)        D → .aE(bb)
    {ab}                {ac}                {ab, ac}

    10                  11                  12
    A → a.B(bb)         A → a.B(bb)         A → a.B(bb)
    B → .Db(bb)         B → .Db(bb)         B → .Db(bb)
    D → .Ab(bb)         D → .Ab(bb)         D → .bc(bb)
    A → .aB(bb)         A → .acd(bb)
    {aa, ac}            {ac}                {bc}


Figure 5.14


As we have said, A → .aB(bb) is the only vital item; since the only
chain in which it appears is chain 10, chain 10 is the only vital chain
we begin with. Thus the first version of the predictive language is
{aa, ac}. The string ac is in the language of chains 8, 9, and 11 as well,
so they too are vital chains. Furthermore, since ab is in the language of
chain 9, it is also part of the predictive language, and this makes chain 7
into a vital chain. Therefore chains 7-11 are vital; and we must find some
σ such that there is a .σ item in each of these chains, lying above the
vital item in chain 10. By inspection, we see that there are two
possibilities for such a σ, namely B and D. The item A → a.B(bb) is in each
of these chains, as is the item B → .Db(bb). However, there is one more
step to check; namely, we must determine whether the other lookaheads of
these items can safely occasion their prediction. For the case σ = B, this
fails to be the case. The item A → a.B(bb) appears in chain 6, and cd is in
the language of chain 6; but cd is also in the languages of chains 1 and 5,
and A → a.B(bb) appears in neither of them. Things work out better for
predicting a D. The only non-vital chain in which B → .Db(bb) appears is
chain 12; the language of chain 12 is {bc}, and bc occurs in the language
of no other chain. Hence the D in B → .Db(bb) can always be safely predicted
upon sighting one of its lookaheads. The splitting induced by this
prediction is given in Figure 5.15; the cycle has indeed been broken, and
the base state is the a-successor of the predictive state.
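
The bookkeeping in this example can be mimicked mechanically. The following
is only a minimal sketch, assuming that the chains of Figure 5.14 are
encoded by hand as lists of items (lookaheads omitted) together with their
languages; the function names are illustrative and are not part of the
procedure as formally stated. It grows the set of vital chains until the
predictive language stabilizes, and then applies the safety check described
above to each candidate item.

```python
# Chains of Figure 5.14: number -> (items on the chain, language of the chain).
# The (bb) lookaheads are dropped for brevity; this encoding is an assumption.
CHAINS = {
    1: (["A->a.cd"], {"cd"}),
    2: (["D->a.b"], {"bb"}),
    3: (["D->a.cc"], {"cc"}),
    4: (["D->a.E", "E->.bd"], {"bd"}),
    5: (["D->a.E", "E->.cdd"], {"cd"}),
    6: (["A->a.B", "B->.cdd"], {"cd"}),
    7: (["A->a.B", "B->.Db", "D->.ab"], {"ab"}),
    8: (["A->a.B", "B->.Db", "D->.acc"], {"ac"}),
    9: (["A->a.B", "B->.Db", "D->.aE"], {"ab", "ac"}),
    10: (["A->a.B", "B->.Db", "D->.Ab", "A->.aB"], {"aa", "ac"}),
    11: (["A->a.B", "B->.Db", "D->.Ab", "A->.acd"], {"ac"}),
    12: (["A->a.B", "B->.Db", "D->.bc"], {"bc"}),
}


def vital_chains(chains, vital_items):
    """Grow the vital chains until the predictive language stops changing."""
    vital = {n for n, (items, _) in chains.items() if vital_items & set(items)}
    while True:
        language = set().union(*(chains[n][1] for n in vital))
        grown = {n for n, (_, lang) in chains.items() if lang & language}
        if grown <= vital:
            return vital, language
        vital |= grown


def common_items(chains, vital):
    """Items lying on every vital chain: the candidates for a prediction."""
    return set.intersection(*(set(chains[n][0]) for n in vital))


def safely_predictable(chains, vital, item):
    """A non-vital chain through `item` must not share a lookahead string
    with any chain that does not pass through `item`."""
    for n, (items, lang) in chains.items():
        if n in vital or item not in items:
            continue
        for other_items, other_lang in chains.values():
            if item not in other_items and lang & other_lang:
                return False
    return True


vital, language = vital_chains(CHAINS, {"A->.aB"})
candidates = common_items(CHAINS, vital)
safe = {item for item in candidates if safely_predictable(CHAINS, vital, item)}
```

Under this encoding the vital chains come out as 7 through 11, the two
candidate items are A → a.B and B → .Db, and only the latter passes the
safety check, which agrees with the choice of predicting D made above.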


 (predictive state)                (base state)

   D → .ab(bb)                     A → a.B(bb)
   D → .acc(bb)                    A → a.cd(bb)
   D → .aE(bb)                     D → a.b(bb)
   D → .bc(bb)       --- a --->    D → a.cc(bb)
   D → .Ab(bb)                     D → a.E(bb)
   A → .aB(bb)                     B → .cdd(bb)
   A → .acd(bb)                    B → .Db(bb)
                                   E → .bd(bb)
                                   E → .cdd(bb)

Figure 5.15

Of course there are any number of places where the procedure we have been
describing may fail. First of all, there may be no predictable nonterminal
which induces a splitting in which all vital items are excluded from the base
of the splitting. That is, there might not be any nonterminal A such that
there is a .A-item in the appropriate place in each vital chain. Then even if
there is such an A, the wrong choice of distinguished .A-items may have been
made. That is, if A is left recursive, FOLLOW_k(B, A) ∩ FOLLOW_k(P, A)
may not be empty, if B and P are defined as the choice of .A-items would
dictate. This might necessitate going back and choosing other distinguished
.A-items to define B and P, and seeing whether or not any problems result.
It might be necessary to increase the number of distinguished .A-items over
those that are absolutely necessary to keep the vital items out, in order to
avoid disjointness problems. But these are fine tuning issues with respect
to the choice of a particular A as the predictive nonterminal. It always
suffices just to compute all possible legal bipartite splittings where A
is the predictive nonterminal, and see whether any of them have the vital
items in the predictive state. Of course, there may be no such legal
splitting based on an A, so we might have to backtrack to find another
suitable predictive nonterminal and some splitting based on it. And in fact
there may not be any splitting at all of the state in question which has the
desired structure; exhaustion of the list of all possible predictive
nonterminals might show that no splitting of this state could break the
cycle. Then we could attempt to break the cycle by splitting some other of
its states, and apply this procedure all over again to that state.
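
The backtracking order just described (splittings within a nonterminal,
nonterminals within a state, states within the cycle) can be written down as
a nested search. This is only a sketch: the three callables passed in are
hypothetical stand-ins for the constructions of this chapter, and the toy
tables below are invented purely to exercise the control flow.

```python
def find_breaking_split(cycle_states, vital_items_of, nonterminals_of, splittings_of):
    """Look for a bipartite splitting that keeps the vital items out of the base.

    Hypothetical helpers (assumptions, not definitions from the text):
      vital_items_of(state)          -> set of vital items of the state
      nonterminals_of(state)         -> candidate predictive nonterminals
      splittings_of(state, nonterm)  -> iterable of (predictive, base) item sets
    """
    for state in cycle_states:                       # last resort: another state
        vital = vital_items_of(state)
        for nonterminal in nonterminals_of(state):   # backtrack over nonterminals
            for predictive, base in splittings_of(state, nonterminal):
                if vital <= predictive and not (vital & base):
                    return state, nonterminal, (predictive, base)
    return None                                      # this cycle resists splitting


# Toy tables, loosely modelled on the state of Figure 5.13 (items abbreviated).
VITAL = {"q": {"A->.aB"}}
CANDIDATES = {"q": ["B", "D"]}
SPLITS = {
    ("q", "B"): [],  # no legal splitting when B is predicted
    ("q", "D"): [({"D->.Ab", "A->.aB"}, {"A->a.B", "B->.Db"})],
}

result = find_breaking_split(
    ["q"],
    lambda s: VITAL[s],
    lambda s: CANDIDATES[s],
    lambda s, n: SPLITS[(s, n)],
)
```

The search returns the first acceptable splitting it meets; choosing among
several acceptable splittings is a separate question, taken up later in this
section.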

The foregoing discussion and algorithms provide us with a structured
approach to the problem of breaking a particular cycle in an MSP(k) machine
by splitting one of its states. Now we have made no formal claim that if
there is some way to eliminate a cycle from an MSP(k) machine, then the
foregoing procedure will succeed and find a state in the cycle which can be
split in a bipartite splitting, with all vital items in the predictive state.
The possibilities for the effect of a sequence of arbitrary state-splittings
on a machine are too bizarre to allow any such wide-sweeping generalization.
That is, it may be possible, by splitting some other state which dominates
all the states in the cycle, to eliminate from the machine the exact states
which form the cycle in question. But we do not really need such a general
assurance, for we are not looking for a decision procedure; we already know
that the class of k-transformable grammars is decidable. Rather we are
interested in heuristics, in good procedures that work very often, and
especially in most useful cases.

The obvious next step then, is to go from the problem of eliminating any
single cycle from an MSP(k) machine, to eliminating all the cycles from an
MSP(k) machine. We are especially interested in one particular version of
the latter task, namely where the MSP(k) machine is the LR(k) machine for
the grammar we start out with. That, after all, is what we're really trying
to do: eliminate all the cycles from the LR(k) machine.

In the foregoing, we have examined the problem of eliminating a single
cycle from an MSP(k) machine by splitting one of the states in the cycle.
Can we similarly propose strategies for removing a number of cycles from a
machine by sequentially splitting a number of states? Is there more to
breaking a number of cycles than just breaking each one individually?

In general, the answer to this last question is no, there is not any more
to eliminating a set of cycles than just eliminating each one of them; but
there are a few possible complications which we shall discuss briefly.

First of all, we know that by splitting a state in a cycle so that all
of its vital items with respect to that cycle are to be found only in the
predictive state(s) of the splitting, we do indeed remove that particular
cycle from the machine. That is, if q is its own α-successor, and we replace
q by a bipartite splitting with all the vital items in the predictive state
P, then the base state B will be the α-successor of P, and the cycle is gone.
But is it possible for other cycles in the machine to be affected by this
state-splitting in such a way that although they may have formerly been
eliminable, they no longer are? Or can new cycles be introduced into the
machine as a result of the state-splitting, new cycles which cannot be
removed by state-splitting?

Offhand, it appears that the first of these questions is not a very
sensible one. It seems that splitting a state of one cycle can have only very
limited effects on any other cycle. Either some state of the other cycle is
dominated by the state being split and disappears after it is split, causing
the other cycle to vanish along with it; or else the other cycle will not be
affected by the state-splitting. However, upon reflection we see that there
is one circumstance in which breaking one cycle by splitting one of its states
can adversely affect the prospects for breaking another cycle: and that is
where the state being split to break one cycle is also a state of the other
cycle. Then it might be that by splitting the state in question so as to
break the first cycle, the vital items of the other cycle will be put in the
base state of the splitting; and if no other states of this second cycle can
be broken in a useful way, then there will be no way to break the cycle, even
though some other splitting of the state in question might have broken the
other cycle.

For example, consider the LR(0) states of Figure 5.16; there are two
cycles in this diagram, one on a and the other on Aa. If we split the state
on the left by predicting an A, we can break one of these cycles as shown


   (loop on a)

   A → a.B                        B → A.C
   B → .AC       --- A --->       B → A.d
   B → .Ad                        C → .Ab
   A → .b        <--- a ---       A → .b
   A → .aB                        A → .aB

Figure 5.16


in Figure 5.17. But as we see from inspection of the result of this
state-splitting, it is now impossible to break the cycle on Aa, because one
of the states in the cycle is a base state which cannot be split; while there
are no nonterminals that can be predicted in the other state, and hence that
can't be split either. However, had we made the original split by predicting
a B rather than an A, we would have the configuration of Figure 5.18, in
which both cycles have been broken.


Figure 5.18


The moral of this story is that in splitting a state through which several
cycles pass, we must exercise caution in choosing the splitting, so as to
break all cycles if possible. Furthermore, it may happen that while two
cycles in a given machine can each individually be broken by splitting the
same state of the machine, two different splittings are used to break the two
different cycles. But it is not hard to see that except in the most contrived
circumstances, if there is one bipartite splitting of q which excludes all of
one set of vital items from the base, and another bipartite splitting of q
which excludes all of another set of vital items from its base, then there is
some splitting of q (possibly multi-partite) which excludes both sets of
vital items from the base. (This is one case where bipartite splittings may
not suffice and the general theory is needed.) Thus we do have one caveat to
heed while proceeding to break a set of cycles by splitting a state from
each, one at a time: namely, to be on the lookout for states which can be
split so as to break several cycles. Pursuance of this course will also help
us avoid a potential pitfall. It might happen, if we are not careful, that
all the states in some cycle are split (so as to break other cycles), but the
cycle is left intact, consisting of the base states of these splittings.
Caution with respect to the issues just discussed can forestall this
occurrence.

The question of the introduction of new cycles as the result
of performing a state-splitting seems to be a serious difficulty, but a little
reflection shows it not to be a dangerous problem. First of all, suppose
that a splitting obeys the following property: if some .σ-item is in a
component of the splitting, then all .σ-items are. If a state in a machine is
replaced by such a splitting, then no new states (other than the components
themselves) are introduced into the machine as a result of the splitting;
the successors of the components are the same as the successors of the state
they are replacing. A great many state-splittings do obey this condition;
since replacing a state by such a splitting does not add any new states to
the machine, it cannot cause any new cycles either. But even if this happy
state of affairs does not obtain, and we use a less benign splitting, the
situation of the machine will not be degraded by performing the
state-splitting. To be more precise, we make the following claim: suppose M'
is obtained from M by splitting some state of M; then if every cycle of M
can be broken, so can every cycle of M' be broken. In other words, though
the performance of a state-splitting may introduce new cycles into a machine,
these cycles will not be of a new and difficult character, and they by
themselves will not prevent us from realizing our goal of a cycle-free
machine; one of the new cycles will be impossible to remove only if some old
cycle was also unbreakable.
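
The benign property mentioned at the start of this paragraph is easy to test
for a proposed splitting. The sketch below assumes, purely for illustration,
that an LR(0)-style item is a triple (left-hand side, right-hand side,
dot position) and that a splitting is a list of item sets; the function
names are not taken from the text.

```python
def next_symbol(item):
    """Symbol immediately after the dot, or None for a completed item."""
    _lhs, rhs, dot = item
    return rhs[dot] if dot < len(rhs) else None


def keeps_symbols_together(state, components):
    """True if every component that holds some .σ-item holds all of them."""
    for component in components:
        symbols = {next_symbol(item) for item in component} - {None}
        for symbol in symbols:
            all_items = {item for item in state if next_symbol(item) == symbol}
            if not all_items <= component:
                return False
    return True


# The cycle state of the grammar S -> A, A -> aB, A -> b, B -> Ab.
A_aB_1 = ("A", "aB", 1)   # A -> a.B
B_Ab_0 = ("B", "Ab", 0)   # B -> .Ab
A_aB_0 = ("A", "aB", 0)   # A -> .aB
A_b_0 = ("A", "b", 0)     # A -> .b
state = {A_aB_1, B_Ab_0, A_aB_0, A_b_0}

# Predicting B: the base keeps only A -> a.B.
benign = keeps_symbols_together(state, [{A_aB_1}, {B_Ab_0, A_aB_0, A_b_0}])

# An artificial splitting that separates the two .a / .b items of A.
split_badly = keeps_symbols_together(state, [{A_aB_1, A_b_0}, {B_Ab_0, A_aB_0}])
```

A splitting that passes this check has component successors already present
in the machine, so by the remark above it cannot introduce new cycles.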

The argument for our claim is as follows. Suppose there is a new
unbreakable cycle in the machine after a state-splitting has been done. Let
q be some state of this cycle, where q is its own α-successor. Then q is a
substate of some state q', which was in the machine before the splitting was
done. For any φ, the φ-successor of q in its cycle is contained in some
φ-successor of q'. In particular, q is contained in some α^i-successor of q',
for each i. This means that some α^i-successor of q' is its own
α^j-successor. Thus there is some cycle of successors of q', where each state
in this cycle contains some state in the cycle in which q is involved. If
some state in this former cycle could be split so as to break the cycle, the
same would be true for the state it contains in the latter cycle. So if the
latter cycle is unbreakable, so is the former, and our claim is established.
It is easy to formalize the foregoing discussion and make this assertion
rigorous. The upshot of this argument is that no disasters will occur as the
result of a state-splitting: no new unbreakable cycles can crop up to haunt
us.

Thus a simplistic approach to the problem of ridding a machine of all
its loops turns out to be relatively attractive. In summary, the strategy
is to pick any cycle and split one of its states so that no vital items are
in the base of the splitting; this will break that particular cycle. Then
we pick some other cycle in the new machine and apply this process again;
and continue iterating this procedure as long as necessary, always watching
out for states that can break two cycles at the same time. If the result of
this process is a cycle-free machine, then we can apply the grammatical
transformation of Chapter 4 and construct the desired LL(k) grammar. We
stress that this approach will not necessarily work for the full class of
k-transformable grammars, but that it will be useful for any reasonable
grammar that is not laden with a variety of peculiar features. If this
systematic approach does not succeed in constructing a cycle-free machine for
a given grammar, we can always resort to the haphazard procedure of
constructing all possible MSP(k) machines for the grammar and seeing if any
one of them is cycle-free.
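
The iteration summarized above has a simple outer shape. The following is a
sketch only: the machine is reduced to a bare transition graph (a dictionary
from each state to its labelled successors), cycles are found by an ordinary
depth-first search, and `break_cycle` is a hypothetical callback standing in
for the state-splitting procedure of this section.

```python
def find_cycle(machine):
    """Return the states of some cycle of the transition graph, or None.

    `machine` maps every state to a dict {symbol: successor}; every
    successor is assumed to be a key of `machine` as well.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {state: WHITE for state in machine}
    stack = []

    def visit(state):
        colour[state] = GREY
        stack.append(state)
        for successor in machine[state].values():
            if colour[successor] == GREY:          # back edge: a cycle
                return stack[stack.index(successor):]
            if colour[successor] == WHITE:
                found = visit(successor)
                if found:
                    return found
        stack.pop()
        colour[state] = BLACK
        return None

    for state in machine:
        if colour[state] == WHITE:
            found = visit(state)
            if found:
                return found
    return None


def eliminate_cycles(machine, break_cycle, max_rounds=100):
    """Repeatedly break one cycle at a time.

    `break_cycle(machine, cycle)` is a hypothetical callback returning the
    machine after a state-splitting, or None if the cycle cannot be broken.
    """
    for _ in range(max_rounds):
        cycle = find_cycle(machine)
        if cycle is None:
            return machine             # cycle-free: ready for Chapter 4
        machine = break_cycle(machine, cycle)
        if machine is None:
            return None                # fall back on exhaustive search
    return None


toy = {"q0": {"a": "q1"}, "q1": {"a": "q1", "A": "q2"}, "q2": {}}
first_cycle = find_cycle(toy)
```

The bound `max_rounds` is an artificial safeguard of the sketch; the text
makes no claim about how many splittings may be needed.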

In particular, there is one large class of grammars, a subset of the
k-transformable grammars, for which these techniques will work, and the
outstanding feature of this class is that its decision procedure is much
better than the general one for k-transformable grammars. We can define this
class as those grammars whose LR(k) machines satisfy the following property:
there is a set of states of the machine and a splitting of each state in the
set such that for any cycle in the machine, one of these states is in the
cycle, and the splitting of that state breaks the cycle. (This is almost the
same as the simpler statement that each cycle in the machine can be broken by
splitting one of its states. There is a difference of quantification between
the two statements, because in the second statement the same state might be
split in different ways in order to break different cycles, while this cannot
be the case in the more precise definition we are using.) By our previous
discussion, it will be possible to construct a cycle-free MSP(k) machine for
one of these grammars. The construction procedure will consist of breaking
any cycle in the machine by splitting the appropriate state, and computing
the resultant machine. Any cycle in this new machine will be breakable
because any cycle in the original machine was; the splittings of states in
the original machine can serve as models for splitting states in the new
machine. So we choose some cycle and break it by a state-splitting as before.
Again, any cycle in this resulting machine will be breakable, and so we can
continue on in this way, until we have eliminated all the cycles, and are
left with a cycle-free machine. Though tedious to do, all of this can be made
rigorous.
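
The difference of quantification noted in the parenthesis can be made
concrete with a small brute-force check. In this sketch the encoding is an
assumption made for illustration: `cycles` maps a cycle name to the states
on it, and `breaks` maps a state to its available splittings, each with the
set of cycles that splitting would break.

```python
from itertools import product


def individually_breakable(cycles, breaks):
    """Weaker statement: each cycle is broken by some splitting of one of
    its states (possibly a different splitting of the same state each time)."""
    return all(
        any(name in broken
            for state in on_cycle
            for broken in breaks.get(state, {}).values())
        for name, on_cycle in cycles.items()
    )


def jointly_breakable(cycles, breaks):
    """Stronger statement: one fixed splitting per state covers all cycles.
    Returns such a choice of splittings, or None."""
    states = sorted(state for state in breaks if breaks[state])
    for choice in product(*(sorted(breaks[state]) for state in states)):
        chosen = dict(zip(states, choice))
        if all(
            any(state in chosen and name in breaks[state][chosen[state]]
                for state in on_cycle)
            for name, on_cycle in cycles.items()
        ):
            return chosen
    return None


# One state q lies on two cycles (in the spirit of Figure 5.16).
cycles = {"c1": {"q"}, "c2": {"q", "r"}}
good = {"q": {"predict A": {"c1"}, "predict B": {"c1", "c2"}}}
bad = {"q": {"s1": {"c1"}, "s2": {"c2"}}}

good_choice = jointly_breakable(cycles, good)
bad_choice = jointly_breakable(cycles, bad)
```

In the second table every cycle is individually breakable, yet no single
splitting of q serves for both, so only the weaker statement holds.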

It is apparent that it is relatively straightforward to determine if a
grammar satisfies the above condition; it is only necessary to see if each
cycle in the LR(k) machine can be broken by a state-splitting. This is at
least a deterministic problem which we can attack in a structured way, as
opposed to the random decision procedure we have for the general class of
k-transformable grammars. Furthermore, this test is one that can be applied
directly to the LR(k) machine for the grammar; we do not need to construct
a whole series of auxiliary machines just to determine if the grammar
interests us or not.

With the preceding discussion we have achieved at least a limited
version of our goal of rationalizing the process of constructing cycle-free
machines and bringing the transformation procedure into the realm of
practicality. Now we wish to provide some additional suggestions, some
second-order heuristics, to guide the program or person going through the
procedure of eliminating the cycles from an LR(k) machine by repeated
state-splittings. These observations are intended to assist him in producing
a cycle-free machine which is attractive when compared with other cycle-free
machines that he might otherwise create, with respect to features that
influence the character of the derived grammar.

First of all, we know that the size of the derived grammar is heavily
influenced by the number of predictive states in the machine from which it is
derived. So an effort should be made to keep down the number of state
splittings performed and the number of predictive states created. Whenever
possible, a bipartite splitting is to be preferred to a multipartite one; and
it becomes even more attractive to break several cycles with one splitting,
whenever such a feat is possible. Furthermore, it is recommended to try to
introduce as few new states as possible into the machine as a result of a
state-splitting; this can be best effected by choosing a splitting (whenever
possible) which has all .σ items in the same component of the splitting, for
each σ. If this is done, the successors of the components will be states
already in the machine. This minimization of new states is a desirable goal
because it forestalls the creation of new cycles, which, though related to
existing cycles and hence breakable, will nonetheless require additional
state-splittings to accomplish this breaking. In general, if there are
several bipartite splittings of a given state that will suffice to break a
cycle in which that state occurs, the preferred splitting is that which
places the fewest items from the state in the base of the splitting.
Pursuance of this strategy, particularly in conjunction with the preceding
recommendations, will have a beneficial effect on the size of the grammar
derived from the constructed cycle-free machine. The reasoning behind this is
as follows. Suppose q is a state which is its own α-successor, and which
can be split in two ways, each of which puts all the vital items into the
predictive state; but suppose in the second splitting only, all .σ items are
in the predictive state. Now imagine two cycle-free machines, derived from
the original machine, one in which q has been replaced by the first splitting
and the other in which it has been replaced by the second. Let us then
consider the grammars derived from these two machines. Suppose that the base
state of the splitting has the name (X, φ) in each machine. If the names of
the predicted nonterminals of the two splittings are Y1 and Y2, then since
the base will be the α-successor of the predictive state in either machine,
the base will also have the name (Y1, α) in the first machine and (Y2, α) in
the second. In the first machine, the base state of the splitting has .σ
items, so it has a σ-successor; this successor will have two names, (X, φσ)
and (Y1, ασ). In the second machine however, the predictive state has the
.σ-items, and a σ-successor; its only name will be (Y2, σ). Thus in the
former case the σ-transition contributes two nonterminals to the derived
grammar, while in the latter it gives rise to only one.




For example, consider the LR(0) grammar S → A, A → aB, A → b, B → Ab. The
LR(0) machine for this grammar is shown in Figure 5.19, and has one cycle


                            (loop on a)

   S → .A                   A → a.B
   A → .aB    --- a --->    B → .Ab     --- A --->    B → A.b
   A → .b                   A → .aB
                            A → .b

   (the b- and B-successors of these states are completed items)

Figure 5.19


in it. The cycle can be broken either by predicting an A or by predicting
a B in the state of the cycle; in either case the vital item A → .aB is
not in the base state. The two resulting machines are shown in Figure 5.20
and Figure 5.21. According to our argument, the second of these machines is
to be preferred, because it has the smaller base state.


Figure 5.20


Figure 5.21

To check this we compute the derived grammars for these two machines in
Figure 5.22.


(S, ε)   → a(S, a)                 (S, ε)  → a(S, a)
(S, ε)   → b(S, b)                 (S, ε)  → b(S, b)
(S, b)   → (S, A)                  (S, b)  → (S, A)
(S, a)   → (A, ε)(S, aA)           (S, a)  → (B, ε)(S, aB)
(S, aA)  → b(S, aAb)               (S, aB) → (S, A)
(S, aAb) → (S, aB)                 (S, A)  → (S, S)
(S, aB)  → (S, A)                  (S, S)  → ε
(S, A)   → (S, S)                  (B, ε)  → a(B, a)
(S, S)   → ε                       (B, ε)  → b(B, b)
(A, ε)   → a(A, a)                 (B, a)  → (B, ε)(B, aB)
(A, ε)   → b(A, b)                 (B, aB) → (B, A)
(A, a)   → (A, ε)(A, aA)           (B, A)  → b(B, Ab)
(A, aA)  → b(A, aAb)               (B, Ab) → (B, B)
(A, aAb) → (A, aB)                 (B, b)  → (B, A)
(A, aB)  → (A, A)                  (B, B)  → ε
(A, b)   → (A, A)
(A, A)   → ε

Figure 5.22

Inspection of these two grammars verifies that the second machine is
preferable to the first. The first derived grammar has 15 nonterminals and
17 rules, while the second has 13 nonterminals and 15 rules. Not a drastic
improvement perhaps, but suggestive. We could have predicted this difference
by examining the machines. The first machine has two more states as
successors of the base state than the second machine; these states thus have
one extra name each in the first machine, meaning two more nonterminals in
the first grammar than in the second.
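
The two counts can be confirmed mechanically. The sketch below simply
transcribes the rules of Figure 5.22 as text and counts distinct
parenthesized names and lines; the textual encoding (one rule per line,
names written without internal spaces) is a convenience of the sketch.

```python
import re

# Left-hand grammar of Figure 5.22 (A is the predicted nonterminal).
FIRST_GRAMMAR = """
(S,ε) -> a (S,a)
(S,ε) -> b (S,b)
(S,b) -> (S,A)
(S,a) -> (A,ε) (S,aA)
(S,aA) -> b (S,aAb)
(S,aAb) -> (S,aB)
(S,aB) -> (S,A)
(S,A) -> (S,S)
(S,S) -> ε
(A,ε) -> a (A,a)
(A,ε) -> b (A,b)
(A,a) -> (A,ε) (A,aA)
(A,aA) -> b (A,aAb)
(A,aAb) -> (A,aB)
(A,aB) -> (A,A)
(A,b) -> (A,A)
(A,A) -> ε
"""

# Right-hand grammar of Figure 5.22 (B is the predicted nonterminal).
SECOND_GRAMMAR = """
(S,ε) -> a (S,a)
(S,ε) -> b (S,b)
(S,b) -> (S,A)
(S,a) -> (B,ε) (S,aB)
(S,aB) -> (S,A)
(S,A) -> (S,S)
(S,S) -> ε
(B,ε) -> a (B,a)
(B,ε) -> b (B,b)
(B,a) -> (B,ε) (B,aB)
(B,aB) -> (B,A)
(B,A) -> b (B,Ab)
(B,Ab) -> (B,B)
(B,b) -> (B,A)
(B,B) -> ε
"""


def grammar_size(text):
    """Return (number of distinct nonterminals, number of rules)."""
    rules = [line for line in text.strip().splitlines() if line.strip()]
    nonterminals = set(re.findall(r"\([^()]*\)", text))
    return len(nonterminals), len(rules)


first_size = grammar_size(FIRST_GRAMMAR)    # (15, 17)
second_size = grammar_size(SECOND_GRAMMAR)  # (13, 15)
```

The two nonterminals saved are (S, aA) and (S, aAb) against their absence on
the right, offset elsewhere by names of equal number in the A- and B-parts.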

To  conclude  this  section,  we  make  some  remarks  about  some  possible  gen¬ 
eralizations  of  our  model  of  MSP(k)  machines.  As  we  have  defined  and  developed 
the  theory,  an  MSP(k)  machine  is  basically  a  canonical  LR(k)  machine  that  is 
allowed  to  make  predictions  based  on  the  inspection  of  k  symbols  of  lookahead. 
We  put  a  number  of  restrictions  on  our  model  that  were  not  critical,  but  which 
just  made  the  formalisms  more  tractable;  now  we  shall  remove  some  of  these 
conditions.  We  shall  not  redo  the  entire  development  for  these  more  general 
machines;  it  will  be  fairly  clear  that  no  great  dislocations  occur  under  the 
proposed  modifications.  We  include  these  revised  models  here  because  they 
will  frequently  be  of  practical  utility;  it  will  often  be  the  case  that  a 
smaller  cycle-free  machine  can  be  found  if  these  restrictions  are  lifted  than 
if  they  are  not. 

First, we recall the more or less arbitrary restriction that any initial
state of an MSP(k) machine could be associated with only one base state of the
machine. This was done so as to ease the proof that the derived grammar
generates the same language as that recognized by the machine. However, it
is possible to relax this restriction in some circumstances, without
invalidating the result. Namely, if two identical initial states are
associated with two different base states and the predictive languages
associated with the two predictions are also identical, then the two initial
states can really be considered as one. This is a most valuable feature for
it can significantly reduce the number of initial states in a machine and
correspondingly reduce the size of the derived grammar. The relaxation of
this restriction obviously affects our strategies for state-splitting; it now
becomes most desirable to split two states in such a way that the splittings
share a common initial state.

For example, consider the states of Figure 5.23, which are involved in
three cycles. It is possible to break these cycles in several different ways,


   (loop on a)                    (loop on A)

   A → a.B                        B → A.C
   B → .AC       --- A --->       C → .Bd
   B → .d                         B → .AC
   A → .aB       <--- a ---       B → .d
   A → .b                         A → .aB
                                  A → .b

Figure 5.23


including predicting B in the first state and C in the second, which would
follow our earlier stated goal of keeping as little as possible in the base
states. However, it is also possible to break these cycles by predicting a
B in both states, and sharing the predictive state between the two splittings,
as shown in Figure 5.24.


Figure 5.24


This machine section thus has only one predictive state rather than two, which
has a beneficial effect on the derived grammar. Let (X, φ) be any name for the
leftmost base state in the machine at large. Then the machine section of
Figure 5.24 will give rise to the rules of Figure 5.25.


    ...
    (B, aB)  → (B, A)
    (B, A)   → (B, ε)(B, AB)
    (B, AB)  → d(B, ABd)
    (B, ABd) → (B, AC)
    (B, AC)  → (B, B)
    (B, B)   → ε

Figure 5.25


This compares very favorably with the grammar that would result if we split
one of the states by predicting B and the other by predicting C. In that
case, this section of the machine would generate 18 nonterminals and 20 rules,
for each name (X, φ) of the first base state. This illustrates why the process
of choosing a good set of state-splittings is partly an intuitive and
heuristic procedure, since there are potentially conflicting standards for
what defines a good splitting. On the one hand, we like to keep as little as
possible in the base state; on the other, we like it when two splittings can
share a predictive state. Anyone trying to construct a "good" cycle-free
machine by performing a sequence of state-splittings must keep these and
other criteria in mind, and strive to strike a balance among them.

Another restriction we made very early on was that only lookaheads of
length k could be inspected in order to make predictions in machines whose
states were composed of LR(k) items. This was to ensure that every item
generated some lookahead strings of the appropriate length; that made all
the definitions very convenient and easy to state. But there is a drawback
to this simplifying assumption. In general, if k1 > k2, then there are many
more LR(k1) items for a given grammar than there are LR(k2) items. So
frequently the LR(k1) machine for a grammar will be much larger than the
LR(k2) machine, and therefore a cycle-free machine obtained from the former
may well have more states than one obtained from the latter. Hence we would
prefer to construct a cycle-free MSP(k2) machine. However, it may be the case
that k2 symbols do not provide sufficient lookahead in order to split states
in the LR(k2) machine, while k1 symbols would suffice. We would like to use k1
symbols of lookahead to make a prediction in a machine whose states are sets
of LR(k2) items. Even though this violates one of our restrictions, there is
nothing wrong with this in principle, provided we are careful with the
identity of the lookahead set.

For example, consider the LR(1) state of Figure 5.26. It is impossible to break this cycle using only one symbol of lookahead, since there is an essential .a-item and a must be in the predictive language.

    A → a.B  (b)
    A → a.ac (b)
    B → .Ab  (b)
    A → .aB  (b)
    A → .aac (b)

Figure 5.26


However, we can make a prediction based on two symbols of lookahead; namely predict B upon seeing aa. The splitting is shown in Figure 5.27.


Figure  5.27 

In order to describe what we are doing in cases like this, we want to modify our definition of prediction and state-splitting so that it becomes meaningful to predict an A in an LR(k2) state upon seeing appropriate k1-length lookaheads. We could try to modify the definitions in the most straightforward way: namely, by changing expressions like FIRST_k2(A, B) to FIRST_k1(A, B) in the definition of the predictive language. The problem with this is that if B is a base state consisting of LR(k2) items, then FIRST_k1(A, B) may not be defined. Recall that FIRST_k1(A, B) is the union of FIRST_k1 for all .A items in B; but if A generates short strings, then an item like C → α.A(w), where w ∈ V_T^k2, may not have a defined FIRST_k1; Aw may not generate any strings of length greater than k1. That was the advantage of using LR(k)-items to do k-lookahead prediction; every such item was assured of generating k-length strings. So the question becomes, how do we go about defining FIRST_k1 on LR(k2) items where k1 > k2?

There are a variety of ways of varying accuracy in which we can do this. Any of these approaches will work in the sense that machines and grammars constructed using any of these definitions will exhibit the desired properties. Some ways are more precise than others, however, in capturing the precise meaning of FIRST_k1 of an LR(k2) item.

As a first attempt, we could simply define FIRST_k1(A → α1.α2(w)) as FIRST_k1(α2 FOLLOW_k1(A)); in other words, we are defining the lookahead of an item in terms of what it can produce, without paying much attention to the particular context in which it appears. That is, w is really the precise context of the core of this item, and is a particular member of FOLLOW_k2(A); we might choose to forget this bit of information and just consider the full set of strings that can follow A as the context for this item.

A little more accurately, if A → .α(w) is an immediate descendant of B → β1.Aβ2(t), we might define FIRST_k1(A → .α(w)) as being FIRST_k1(α β2 FOLLOW_k1(B)); that is, we recognize that the A on the left-hand side of this item is not any A, but the A in B → β1.Aβ2(t); so the context for this A is anything generated by β2, followed by anything that can follow a B.

But we have still not captured the information that whatever the k1-context of this item may be, its k2-prefix is known to be w. This can be accounted for by defining FIRST_k1(A → .α(w)) as FIRST_k1(αX), where X = {x | x ∈ FIRST_k1(β2 FOLLOW_k1(B)) and x/k2 = w}. However, we are still being careless about the context of the B, using its full follow set, even though we should be restricting it to a more specific context, namely t. We could begin to remedy this by using just those elements of FOLLOW_k1(B) that begin with t, but that is only part of the solution. To get really precise we should make the following definition.

Definition 5.30 The k1-context of an LR(k2) item, where k1 > k2, is defined as follows:

i) If the item is A → α.β(w), its k1-context is the same as that of A → .αβ(w).

ii) If the item is A → .α(⊣^k2), its k1-context is ⊣^k1.

iii) If A → .α(w) is an immediate descendant of B → β1.Aβ2(t), then the k1-context of A → .α(w) includes all strings y in FIRST_k1(β2 X), where X is the k1-context of B → β1.Aβ2(t), such that y/k2 = w.

Then if we talk about FIRST_k1 of an item A → .α(w), we mean FIRST_k1(αX), where X is the k1-context of the item. But even this can be refined further. For purposes of state-splitting, we are not really interested in the k1-context of an item, but only its k1-context with respect to that state; namely the possible lookaheads that might be found after the core of this item is recognized by this state of the machine. This will generally be a strict subset of the full k1-context of the item; the previous definition can be modified to reflect this additional restriction.
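
Clause (iii) of Definition 5.30 can be illustrated with a small sketch. It assumes the sets FIRST_k1(β2) and the parent item's k1-context X have already been computed and are given as plain sets of strings, with FIRST_k1(β2) also containing any shorter strings that β2 derives in full. The helper names truncate, first_k_product and child_context are hypothetical and chosen only for this illustration.

```python
def truncate(s, k):
    """Return the k-prefix s/k of a string (the whole string if it is shorter than k)."""
    return s[:k]


def first_k_product(xs, ys, k):
    """k-prefixes of all concatenations x + y, with x drawn from xs and y from ys."""
    return {truncate(x + y, k) for x in xs for y in ys}


def child_context(first_beta2, parent_context, w, k1, k2):
    """Strings y in FIRST_k1(beta2 X) whose k2-prefix equals the item's lookahead w.

    first_beta2    -- assumed precomputed FIRST_k1(beta2), short strings included
    parent_context -- assumed precomputed k1-context X of the parent item
    """
    candidates = first_k_product(first_beta2, parent_context, k1)
    return {y for y in candidates if truncate(y, k2) == w}


# Illustrative data only: k1 = 2, k2 = 1, lookahead w = "b".
example = child_context({"b", "cd"}, {"ef", "bg"}, "b", 2, 1)
```

With these made-up sets the candidates are be, bb and cd, and only the first two survive the test y/k2 = w. The coarser definitions discussed earlier correspond to skipping the filter or to replacing X by a full FOLLOW set.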

We repeat that any of these definitions of FIRST_k1 could be used in order to define a state-splitting of an LR(k2) state using k1 symbols of lookahead, and the results derived in the earlier chapters will still pertain. That is, we can choose to predict an A on seeing w, provided there is a .A item in every chain that causes a k1-lookahead of w, even if in this state there will never really be a lookahead of w, and some chains seem to be causing a lookahead of w only because of the inexactness of the procedure for computing which lookaheads the chains cause. In other words, there is no harm in making predictions on spurious lookaheads, so long as the appropriate action is taken for real lookaheads. The only problem with using the grosser definitions of FIRST_k1 is that they might prevent us from splitting states that might legitimately be split using a more accurate definition.

We can take this idea one step further, and use k1 symbols of lookahead not only to split states of an LR(k2) machine, but also the states of an SLR(k2) machine [4]. The states of an SLR(k) machine are composed entirely of LR(0) items; k symbols of lookahead, computed in various ways resembling our computation of FIRST_k1 of LR(k2) items, may be used to determine which state to transfer to during the parse. We can easily add on to this model the additional concept of consulting some symbols of lookahead for predictive purposes. As before, the set of strings that are to occasion prediction of an A are to include at least all strings of length k that might be sighted upon entry to the state and that might have some prefix derived from an A. We can use the previously described methods to do such computation. If a valid prediction can be made on such lookaheads, then the result of replacing a state of an SLR(k) machine by a splitting induced by such a prediction is a variant sort of MSP(k) machine, which will operate in the usual manner; if a sequence of such splittings produces a cycle-free machine, a grammar can be derived from this machine and it will have all the customary desirable properties. This derived grammar will have one additional powerful attraction: its size. Since an SLR(k) machine for G (if it exists) is usually much smaller than the LR(k) machine for G, the sizes of the grammars derived from MSP(k) machines constructed from these two machines bear a similar relationship, frequently to a significant degree.

CHAPTER SIX

TOPICS FOR FURTHER RESEARCH

In this thesis, we defined and studied a new mode of parsing, which was a very general hybrid of bottom-up and top-down deterministic parsing schemes. We developed a model for this method of parsing in the form of MSP(k) machines, and then focused our attention on a particular kind of MSP(k) machine, namely those which contained no cycles. We showed that if a grammar G could be parsed by a cycle-free MSP(k) machine M, then we could derive an LL(k) grammar T_M(G), whose nature depended on M and G, and which was equivalent to G. We examined the properties of T_M(G), especially the kinds of translations it could support, and found ways to make it a more manageable size. Finally, we tried to get some feeling for the class of grammars that can be so transformed, and gained some insight into the problem of actually finding the cycle-free MSP(k) machine for such grammars. While we undoubtedly left some questions unanswered, we feel that we have in large measure succeeded at the task which we set ourselves, namely to discover and study a new transformation to convert non-LL(k) grammars into LL(k) form. But in a sense, the most satisfying and exciting part of this research has been the fact that a large number of collateral issues have been raised in the course of the preparation of this work. It is sometimes said that an important measure of a piece of research is not the number of old questions it answers, but the number of interesting new questions which it asks. We shall now consider some unanswered questions that are related to the work reported in this thesis. These range from speculations on mild generalizations of our work to rather extensive modifications and expansions of our basic ideas.


One piece of unfinished business is, of course, the development, implementation, and analysis of an actual compiler-compiler containing a transformation phase embodying the ideas we have developed. Such an effort could lead to an evaluation of the practicality and utility of the work we have done. Construction of such a system would also provide the impetus for a deeper understanding of the process of constructing a cycle-free MSP(k) machine for a grammar; faster, more efficient algorithms for selecting state-splittings and for breaking cycles would have to be developed. Also, more precise information would be needed about the trade-offs involved in reducing the size of a derived grammar. In a sense, these issues are loose ends of our work which would become more interesting in the context of an implementation effort.

A more theoretical question concerns the extent of the k-transformable grammars. Is there some other way of characterizing this class of grammars? We have seen that the class of k-transformable grammars strictly includes the class of LC(k) grammars. In other words, we know that if the Rosenkrantz transformation can convert a grammar into LL(k) form, then that grammar is k-transformable. Does this result extend to other conventional transformations? In particular it would be good if we could relate the class of k-transformable grammars to the class of grammars that can be converted into LL(k) form by the application of some sequence of transformations drawn from a collection of standard transformations. A canonical "bag of tricks" might include substitution and left factoring, as well as the Rosenkrantz transformation; these are the most common techniques currently employed in heuristic efforts to achieve an LL(k) form for a grammar [20]. A result relating the k-transformable grammars to this class of grammars might eliminate the need for the haphazard application of conventional transforming tricks.

There is another conjecture about the extent of the k-transformable grammars, suggested by R. E. Stearns, which effectively suggests that the k-transformable grammars are precisely the class of grammars that it pays to transform into LL(k) form. It may be stated in the following way. Let us mean by the simple Polish punctation of a grammar, a translation grammar where the translation element of any rule consists of the nonterminals of the rule followed by a terminal symbol unique to the rule. (One can think of this symbol as the name of the action routine to be called when this rule is recognized.) The conjecture states that if G is an LR(k) grammar and if there exists an LL(k) grammar G*, equivalent to G, with some simple translation grammar for G* equivalent to the simple Polish punctation of G, then G is k-transformable. This conjecture says that if there is some LL(k) grammar which is as useful for compilation as G, then that grammar can be found by application of our transformation.

Clearly, in order to make any satisfactory progress in this area, we must have the appropriate definition for k-transformable grammar. It may be necessary to further generalize our definitions of prediction, state-splitting, and MSP(k) machine, and refine our grammatical derivation process, in order to achieve desirable results. We have already seen several different versions of our basic ideas; for example, at first, prediction had to be of all A's in a state, and then we allowed it to be of just certain specific A's. It may be useful to try a further generalization, and allow an A to be predicted by some subset of its lookahead language. This will clearly expand the class of permissible state splittings and hence the class of k-transformable grammars.

For example, consider the LR(1) state of Figure 6.1, which is its own a-successor.

    A → a.b (d)
    A → a.A (d)
    A → .ab (d)
    A → .aA (d)
    A → .be (d)

Figure 6.1

According to our conventional definition of prediction, this state cannot be split using one symbol of lookahead, since b ∈ FIRST_1(A) but b does not indicate the presence of A. But the symbol a does indicate the presence of A, and induces the splitting of Figure 6.2.

    A → a.b (d)
    A → a.A (d)
    A → .be (d)
        |
        | /a
        v
    A → .ab (d)
    A → .aA (d)

Figure 6.2

There are numerous and nontrivial technical difficulties in trying to generalize the notion of state-splitting in this way, particularly in computing the successors of components of the splitting; but further research might provide solutions to the problems.

It might be interesting to consider even more radical variations of our notion of prediction. Consider a machine model that operates as follows. Upon entry to a base state and inspection of the lookahead, a prediction can be made and transfer effected to a submachine. Upon fulfillment of the prediction by the submachine, control is returned to the calling base state; but rather than immediately passing to a successor, the base state can again inspect the lookahead and then possibly make another prediction. In other words, a base state can have several predictive states associated with it, which may be transferred to sequentially. The potential value of such a generalization is clear. It might happen upon entry to a state that with the lookaheads available, only a relatively "low-level" prediction can be made, not enough to break a cycle in which the state occurs; but after this first prediction is fulfilled, a more useful prediction can then be made.

Of course, the grammatical derivation procedure would have to be altered to accommodate this revised machine model. This might be done as follows. Suppose some base state can make a prediction of a Y; after the Y has been found, the base state can predict a W or a Z, depending on the lookahead; and after either of these predictions is fulfilled the base state makes no more predictions, but relinquishes control to a successor state. Then if (X, φ) is a name of the base state, the derived grammar will have rules (X, φ) → (Y, ε)(X, φ)'; (X, φ)' → (Z, Y)(X, φZ); (X, φ)' → (W, Y)(X, φW). Here (X, φ)' is a new nonterminal, which holds a place in the derivation until it is determined whether W or Z is really going to be found. The prediction of the Y is just an auxiliary to the later prediction. When the later prediction is made, it is not the nonterminal (Z, ε) or (W, ε) that gets introduced into the derivation, for Y has already been found; (W, Y) is introduced, to indicate that we are rather belatedly predicting W, after already finding Y, and that we now have to find the rest of W after Y.

Such a notion of retroactive prediction introduces numerous complications into the machine model, and requires substantial revision of our formal development.

But  some  generalization  along  these  lines  will  be  necessary  to  carry 
our  concept  of  prediction  in  bottom-up  parsing  to  its  natural  conclusion,  and 
thus  define  the  largest  possible  class  of  k-transformable  grammars. 

Consider  the  grammar  of  Figure  6.3;  part  of  its  LR(0)  machine  is  shown 
in  Figure  6.4. 

    S → A      B → (B)
    S → C      B → b
    A → BX     X → BA
    A → BY     Y → BC
    C → Bd

Figure 6.3

Figure 6.4

There is a cycle in this part of the machine; and clearly no amount of lookahead could possibly be used to make a prediction that would split a state in such a way that would break the cycle. In state 1, the vital items are X → .BA and Y → .BC; but no amount of lookahead can distinguish X from Y, since they both begin with B, which can generate indefinitely long strings. Similar comments apply to A and C in state 2. It is easy to see that nothing would be gained by constructing LR(k) machines for k > 0, and trying to use lookahead to break the cycles in those machines, since the same situation will obtain there. Clearly this grammar is not k-transformable for any k.

However, the cycle could be broken using the hierarchical prediction scheme just described. Upon entry to state 2, we can safely predict a B. After the B has been found, we can look ahead and predict once again; if the lookahead is the symbol d, then the B we have just found was the first part of a C, otherwise it was the beginning of an A. The grammar rules that would be derived from a splitting based on such a prediction would be (S, BB) → (B, ε)(S, BB)'; (S, BB)' → (C, B)(S, BBC); (S, BB)' → (A, B)(S, BBA); these could well be part of an LL(k) grammar.

It is significant that the grammar of Figure 6.3 can be transformed into LL(1) form by other means. Doing some substitutions, a left factoring, and then applying the Rosenkrantz transformation yields an LL(1) grammar. Thus in order for our transformation to eliminate the need for all others, some generalization along the lines just described will be necessary.

We might go very far afield in another direction, and drastically revise our notion of what constitutes a legal state-splitting. On the one hand, we can consider allowing the predictive languages of two initial states of a splitting to intersect, while keeping the rest of the transformation procedure the same. The only effect of this alteration will be to destroy the LL(k)-ness of the derived grammar. This might prove useful, in enabling us to apply the transformation to an LR(k) grammar for a non-LL(k) language; the transformed grammar will generate the same language as the original grammar, but of course it will not be LL(k). However, it might exhibit other useful and interesting properties.

Another possibility would be to construct a predictive state not for a single nonterminal but for a whole collection of them. Then the derived grammar will indeed be LL(k), but it will not necessarily generate the same language as the original grammar from which the machine was constructed. In this way we can approach the idea of an LL(k) approximation to a non-LL(k) language. This could prove to be a very attractive concept. We could transform a non-LL(k) grammar into an LL(k) grammar for a different but closely related language, by constructing an MSP(k) machine that doesn't make very precise predictions, and deriving a grammar from that machine. Then hopefully we would have an LL(k)-driven compiler that correctly parses any legal program, though it might possibly accept some illegal programs in addition. We might catch these illegal programs with a preprocessor at an earlier stage, paying the price of this two-stage complexity for the advantage of having a well-designed and efficient compiler.

As an example of these ideas, consider the grammar of Figure 6.5. Although this grammar is LR(0), the language that it generates is {a^n b^n ∪ a^n c^n}, which is known not to be LL(k) for any k. The LR(0) machine for this grammar is given in Figure 6.6.

    S → A      A → ab
    S → B      B → aBc
    A → aAb    B → ac

Figure 6.5


Figure 6.6

The first approach just described would have us split the state as shown in Figure 6.7.

Figure 6.7

The grammar generated from the machine containing this splitting would contain the rules (S, a) → (A, ε)(S, aA) and (S, a) → (B, ε)(S, aB). This would prevent the resulting grammar from being LL(k), because A and B both generate indefinitely long strings of a's.

Another possibility would be to create a splitting as shown in Figure 6.8.

Figure 6.8

The name of this predictive state will be neither (A, ε) nor (B, ε), but some new name (X, ε). The derived grammar will have rules (S, a) → (X, ε)(S, aX); (X, ε) → a(X, a); (X, a) → (X, ε)(X, aX); (X, a) → b(X, ab); (X, ab) → (X, A); (X, A) → ε; and so on. In other words, X stands either for A or for B, whichever is appropriate in the context. This derived grammar will be LL(1), but it will generate a new language, namely a^n(b ∪ c)^n, which is different from, but closely related to, the original language. We note that a^n b^n ∪ a^n c^n = a^n(b ∪ c)^n ∩ (a*b* ∪ a*c*); so a finite-state machine preprocessor, letting through only strings in the set a*b* ∪ a*c*, followed by the derived LL(1) parser, will correctly process the language a^n b^n ∪ a^n c^n.
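
The two-stage arrangement just described can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the derived grammar itself: the over-approximating language a^n(b ∪ c)^n is recognized by a hand-simplified LL(1) grammar X → a R, R → X T | T, T → b | c (chosen here for brevity), and the finite-state preprocessor is written as a regular expression. All function names are hypothetical.

```python
import re

# Finite-state preprocessor: lets through only strings in a*b* U a*c*.
PREFILTER = re.compile(r"a*b*|a*c*")


def parse_x(s, i):
    """X -> a R. Return the index just past X, or raise ValueError."""
    if i >= len(s) or s[i] != "a":
        raise ValueError("expected 'a' at position %d" % i)
    return parse_r(s, i + 1)


def parse_r(s, i):
    """R -> X T | T, decided on one symbol of lookahead."""
    if i < len(s) and s[i] == "a":
        i = parse_x(s, i)
    return parse_t(s, i)


def parse_t(s, i):
    """T -> b | c."""
    if i < len(s) and s[i] in "bc":
        return i + 1
    raise ValueError("expected 'b' or 'c' at position %d" % i)


def accepts_approximation(s):
    """True if s is in a^n (b U c)^n with n >= 1 (the LL(1) approximation)."""
    try:
        return parse_x(s, 0) == len(s)
    except ValueError:
        return False


def accepts(s):
    """Preprocessor followed by the LL(1) parser: a^n b^n U a^n c^n."""
    return PREFILTER.fullmatch(s) is not None and accepts_approximation(s)
```

For instance, aabc passes the LL(1) stage alone but is stopped by the preprocessor, while aabb and aacc pass both stages.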

Another exciting prospect is that of applying some variant of our transformation procedure to non-LR(k) grammars. Here of course we could not start with an LR(k) machine for the grammar and try to achieve a cycle-free machine by state-splitting. However, we could attempt to construct an LR(k) machine for the grammar; the result will be a bottom-up nondeterministic parser. If we can split states and get a cycle-free version of this machine, we can proceed to read off a grammar. Of course this grammar will not be LL(k), reflecting the nondeterminacy of the machine's operation; but we have discussed a similar notion in nondeterministic prediction above. However, it may well be that the grammar we derive from this nondeterministic MSP(k) machine will be LR(k) for some k.

For example, consider the grammar S → ADx, S → BDy, A → a, B → a, D → Dd, D → d. This grammar is not LR(k) for any k, since whether to reduce a to A or B can only be decided by looking past the D, which cannot be done with finite lookahead. The quasi-LR(0) machine for this grammar is shown in Figure 6.9.


Figure  6.9 


While this machine may not be deterministic, it is cycle-free, and we can derive a grammar from it, which is shown in Figure 6.10.

    (S, ε)  → a(S, a)        (S, AD)  → x(S, ADx)
    (S, a)  → (S, A)         (S, AD)  → d(S, ADd)
    (S, a)  → (S, B)         (S, BD)  → y(S, BDy)
    (S, A)  → d(S, Ad)       (S, BD)  → d(S, BDd)
    (S, B)  → d(S, Bd)       (S, ADx) → (S, S)
    (S, Ad) → (S, AD)        (S, ADd) → (S, AD)
    (S, Bd) → (S, BD)        (S, BDy) → (S, S)
    (S, S)  → ε              (S, BDd) →

Figure 6.10

This grammar is not LL(k), because of the rules (S, a) → (S, A) and (S, a) → (S, B); this reflects the non-determinacy of the machine's operation. However it is LR(0). It would be extraordinarily useful if we could find a way to convert certain non-LR(k) grammars for LR(k) languages into LR(k) form, for that would free a programming language designer from the constraint of having to make his original language specification in terms of an LR(k) grammar. Admittedly this is not a very onerous restriction, and removing it risks the possibility of the designer coming up with an ambiguous grammar, but the concept is appealing nonetheless.
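
As a sanity check on the example, one can enumerate by brute force the short strings generated by the original grammar and by the grammar read off the machine, and compare them. The sketch below is only an illustration: the enumerator language is a hypothetical helper, nonterminals such as (S, AD) are encoded as strings like "<S,AD>", and the rule (S, BDd) → (S, BD) is an assumption made here by symmetry with (S, ADd) → (S, AD).

```python
from collections import deque


def language(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    rules maps each nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key of rules is a terminal.
    Assumes every derivation cycle adds at least one terminal.
    """
    seen = {(start,)}
    queue = deque(seen)
    found = set()
    while queue:
        form = queue.popleft()
        # Expand the leftmost nonterminal, if there is one.
        idx = next((i for i, sym in enumerate(form) if sym in rules), None)
        if idx is None:
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for sym in new if sym not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


ORIGINAL = {
    "S": [("A", "D", "x"), ("B", "D", "y")],
    "A": [("a",)],
    "B": [("a",)],
    "D": [("D", "d"), ("d",)],
}

DERIVED = {
    "<S,e>": [("a", "<S,a>")],
    "<S,a>": [("<S,A>",), ("<S,B>",)],
    "<S,A>": [("d", "<S,Ad>")],
    "<S,B>": [("d", "<S,Bd>")],
    "<S,Ad>": [("<S,AD>",)],
    "<S,Bd>": [("<S,BD>",)],
    "<S,AD>": [("x", "<S,ADx>"), ("d", "<S,ADd>")],
    "<S,BD>": [("y", "<S,BDy>"), ("d", "<S,BDd>")],
    "<S,ADd>": [("<S,AD>",)],
    "<S,BDd>": [("<S,BD>",)],  # assumed by symmetry with the ADd rule
    "<S,ADx>": [("<S,S>",)],
    "<S,BDy>": [("<S,S>",)],
    "<S,S>": [()],
}

same_up_to_5 = language(ORIGINAL, "S", 5) == language(DERIVED, "<S,e>", 5)
```

Agreement on strings up to a fixed length is of course only evidence, not a proof, that the two grammars generate the same language.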

There is also the possibility of altering the procedure by which the derived grammar is obtained from a machine. Our current approach requires the construction of a cycle-free machine for the process to go forward. Strictly speaking, the grammar derivation procedure could read rules off a machine with cycles just as easily as from a cycle-free machine; the cycle-free requirement is only to ensure that there will be finitely many nonterminals in the grammar. It might be possible to devise a state-naming procedure for certain machines with cycles, such that each state would get only finitely many names; hence a well-defined grammar could be constructed from the machine.

The possibilities of extending our transformation raise numerous questions about successive applications of the transformation. We might consider a hierarchy of classes of grammars, the nth level being those that can be transformed into LL form by n applications of the transformation. The extent of the hierarchy, relationship between different levels in it, and other such questions would be of great theoretical interest with considerable practical significance.

Finally, it might prove interesting to conduct a theoretical investigation
of a rather abstract model of MSP(k) machines. We could define an automaton
called a recursive finite state machine (RFSM), consisting of a set of
finite state machines which have the added feature that they can call each
other. Such calls are effected by inspection of lookahead symbols; upon
entry to one of its final states, a submachine returns control to the state
that called it. For each value of k, we could define the class of k-RFSMs as
being those that inspect k symbols of lookahead in deciding which submachine
to call. A similar model has been proposed by several authors including
Tixier [21], and has been studied extensively by Lomet [16]; but in Lomet's
model, a submachine is capable of transmitting information to its caller when
it returns. Given this additional feature, it can be shown that such machines
can accept the full class of LR(k) languages. It would be interesting to know
what class of languages our more restricted RFSMs accept, for our model seems
very similar to the canonical pushdown machines of Rosenkrantz and Stearns
[18] (basically 1-state PDA's with lookahead); these in turn are excellent
models for the construction of parsers, being compact, efficient, and easy to
implement.
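
As an informal illustration of the RFSM idea, the following Python sketch simulates a 1-RFSM for the language a^n c b^n. It is only a sketch under stated assumptions: the table encoding, the names MACHINES, FINAL, and accepts, the convention that state 0 is every submachine's entry state, and the convention that a call names the state in the caller at which control resumes are all hypothetical choices made for this example, not part of the model as defined in the text. Note that a returning submachine passes nothing back to its caller, which is the restriction that distinguishes this model from Lomet's.

```python
# Hypothetical encoding of a 1-RFSM (one symbol of lookahead).
# Each state maps a lookahead symbol to one of two actions:
#   ("shift", next_state)             consume the symbol, move on
#   ("call", submachine, resume_at)   enter a submachine; resume here on return
END = "$"  # assumed end-of-input marker

# One submachine "S" for the grammar S -> a S b | c (an assumed example).
MACHINES = {
    "S": {
        0: {"a": ("shift", 1), "c": ("shift", 3)},
        1: {"a": ("call", "S", 2), "c": ("call", "S", 2)},
        2: {"b": ("shift", 3)},
    }
}
FINAL = {"S": {3}}  # entering a final state returns control to the caller


def accepts(text, start="S"):
    """Return True if the RFSM accepts text, starting in submachine start."""
    tokens = list(text) + [END]
    pos = 0
    stack = []                 # pending (caller machine, resume state) pairs
    machine, state = start, 0  # state 0 is assumed to be the entry state
    while True:
        if state in FINAL[machine]:
            if not stack:
                # The top-level machine finished: accept only at end of input.
                return tokens[pos] == END
            # Return to the caller; no information is passed back.
            machine, state = stack.pop()
            continue
        action = MACHINES[machine].get(state, {}).get(tokens[pos])
        if action is None:
            return False       # no move is defined for this lookahead
        if action[0] == "shift":
            pos += 1
            state = action[1]
        else:
            _, callee, resume_at = action
            stack.append((machine, resume_at))
            machine, state = callee, 0


if __name__ == "__main__":
    for sample in ["c", "acb", "aacbb", "acbb", "ab", ""]:
        print(repr(sample), accepts(sample))
```

In this sketch the only memory beyond the current state is the stack of return points, which is what makes the model resemble a pushdown machine driven by lookahead.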

The foregoing discussion gives a feeling for some of the issues that have
been raised during the course of our research and which remain unanswered. It
is hoped that whatever limited success we have achieved, and the potential
utility of these investigations, will encourage others to take up where we
have left off.

CHAPTER SIX

TOPICS  FOR  FURTHER  RESEARCH 

In this thesis, we defined and studied a new mode of parsing, which was
a very general hybrid of bottom-up and top-down deterministic parsing schemes.
We developed a model for this method of parsing in the form of MSP(k) machines,
and then focused our attention on a particular kind of MSP(k) machine,
namely those which contained no cycles. We showed that if a grammar G could
be parsed by a cycle-free MSP(k) machine M, then we could derive an LL(k)
grammar T^(G), whose nature depended on M and G, and which was equivalent
to G. We examined the properties of T^(G), especially the kinds of
translations it could support, and found ways to make it a more manageable size.
Finally, we tried to get some feeling for the class of grammars that can be
so transformed, and gained some insight into the problem of actually finding
the cycle-free MSP(k) machine for such grammars. While we undoubtedly left
some questions unanswered, we feel that we have in large measure succeeded at
the task which we set ourselves, namely to discover and study a new
transformation to convert non-LL(k) grammars into LL(k) form. But in a sense, the most
satisfying and exciting part of this research has been the fact that a large
number of collateral issues have been raised in the course of the
preparation of this work. It is sometimes said that an important measure of a piece
of research is not the number of old questions it answers, but the number of
interesting new questions which it asks. We shall now consider some unanswered
questions that are related to the work reported in this thesis. These range
from speculations on mild generalizations of our work to rather extensive
modifications and expansions of our basic ideas.


